This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
IK Analysis for ElasticSearch ================================== The IK Analysis plugin integrates Lucene IK analyzer into elasticsearch, support customized dictionary. Version ------------- master | 0.19.4 -> master 1.1.0 | 0.19.4 -> master 1.0.0 | 0.16.2 -> 0.19.0 Install ------------- In order to install the plugin, simply run: <pre> cd bin plugin -install medcl/elasticsearch-analysis-ik/1.1.0 </pre> also download the dict files,unzip these dict file to your elasticsearch's config folder,such as: your-es-root/config/ik <pre> cd config wget http://github.com/downloads/medcl/elasticsearch-analysis-ik/ik.zip --no-check-certificate unzip ik.zip rm ik.zip </pre> you need a service restart after that! Dict Configuration (es-root/config/ik/IKAnalyzer.cfg.xml) ------------- https://github.com/medcl/elasticsearch-analysis-ik/blob/master/config/ik/IKAnalyzer.cfg.xml <pre> <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd"> <properties> <comment>IK Analyzer 扩展配置</comment> <!--用户可以在这里配置自己的扩展字典 --> <entry key="ext_dict">custom/mydict.dic;custom/sougou.dict</entry> <!--用户可以在这里配置自己的扩展停止词字典--> <entry key="ext_stopwords">custom/ext_stopword.dic</entry> </properties> </pre> Analysis Configuration (elasticsearch.yml) ------------- <Pre> index: analysis: analyzer: ik: alias: [ik_analyzer] type: org.elasticsearch.index.analysis.IkAnalyzerProvider </pre> Or <pre> index.analysis.analyzer.ik.type : "ik" </pre> Mapping Configuration ------------- Here is a quick example: 1.create a index <pre> curl -XPUT http://localhost:9200/index </pre> 2.create a mapping <pre> curl -XPOST http://localhost:9200/index/fulltext/_mapping -d' { "fulltext": { "_all": { "indexAnalyzer": "ik", "searchAnalyzer": "ik", "term_vector": "no", "store": "false" }, "properties": { "content": { "type": "string", "store": "no", "term_vector": "with_positions_offsets", "indexAnalyzer": "ik", "searchAnalyzer": "ik", "include_in_all": "true", "boost": 8 } } } }' </pre> 3.index some docs <pre> curl -XPOST http://localhost:9200/index/fulltext/1 -d' {content:"美国留给伊拉克的是个烂摊子吗"} ' curl -XPOST http://localhost:9200/index/fulltext/2 -d' {content:"公安部:各地校车将享最高路权"} ' curl -XPOST http://localhost:9200/index/fulltext/3 -d' {content:"中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"} ' curl -XPOST http://localhost:9200/index/fulltext/4 -d' {content:"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"} ' </pre> 4.query with highlighting <pre> curl -XPOST http://localhost:9200/index/fulltext/_search -d' { "query" : { "term" : { "content" : "中国" }}, "highlight" : { "pre_tags" : ["<tag1>", "<tag2>"], "post_tags" : ["</tag1>", "</tag2>"], "fields" : { "content" : {} } } } ' </pre> here is the query result <pre> { "took": 14, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 2, "max_score": 2, "hits": [ { "_index": "index", "_type": "fulltext", "_id": "4", "_score": 2, "_source": { "content": "中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首" }, "highlight": { "content": [ "<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首 " ] } }, { "_index": "index", "_type": "fulltext", "_id": "3", "_score": 2, "_source": { "content": "中韩渔警冲突调查:韩警平均每天扣1艘中国渔船" }, "highlight": { "content": [ "均每天扣1艘<tag1>中国</tag1>渔船 " ] } } ] } } </pre> have fun.
Description
The IK Analysis plugin integrates Lucene IK analyzer into elasticsearch, support customized dictionary.
Readme
9.1 MiB
Languages
Java
100%