Adding Chinese Word Segmentation to Elasticsearch
Installing the IK Analysis Plugin
Download the project from GitHub (I downloaded it to /tmp) and unzip it:
cd /tmp
wget https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip
unzip master.zip

Enter the elasticsearch-analysis-ik-master directory:
cd elasticsearch-analysis-ik/

Then use mvn to build the jar, elasticsearch-analysis-ik-1.4.0.jar. This step may take a few attempts before it succeeds:
mvn package

Note that mvn requires Maven to be installed. On Ubuntu, the commands are:
apt-cache search maven
sudo apt-get install maven
mvn -version

Copy the ik folder under elasticsearch-analysis-ik-master/ to ${ES_HOME}/config/.
Copy elasticsearch-analysis-ik-1.4.0.jar from elasticsearch-analysis-ik-master/target to ${ES_HOME}/lib.
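The two copy steps can be wrapped in a small shell function so the paths only need to be stated once. This is a sketch, not part of the original instructions; the example paths in the commented call are assumptions, so adjust them to your build directory and Elasticsearch root:

```shell
# install_ik: copy the IK dictionary folder and the built plugin jar
# into an Elasticsearch installation.
#   $1 -- plugin build directory (contains ik/ and target/)
#   $2 -- Elasticsearch root (contains config/ and lib/)
install_ik() {
  local src="$1" es_home="$2"
  cp -r "$src/ik" "$es_home/config/" &&
  cp "$src/target/elasticsearch-analysis-ik-1.4.0.jar" "$es_home/lib/"
}

# Example invocation (hypothetical paths -- adjust before running):
# install_ik /tmp/elasticsearch-analysis-ik-master /usr/share/elasticsearch
```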
In the configuration file elasticsearch.yml under ${ES_HOME}/config/, add the IK settings at the end:
index:
  analysis:
    analyzer:
      ik:
        alias: [ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
        type: ik
        use_smart: false
      ik_smart:
        type: ik
        use_smart: true
index.analysis.analyzer.default.type: ik

You also need to place httpclient-4.3.5.jar and httpcore-4.3.2.jar in ${ES_HOME}/lib.
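The settings above can be appended from the shell with a heredoc, which avoids indentation mistakes when pasting YAML. A sketch; the default path /tmp/es-demo is a stand-in, so set ES_HOME to your real installation first:

```shell
# Append the IK analyzer settings to elasticsearch.yml.
# ES_HOME defaults to a scratch path here -- override it for a real install.
ES_HOME="${ES_HOME:-/tmp/es-demo}"
mkdir -p "$ES_HOME/config"
cat >> "$ES_HOME/config/elasticsearch.yml" <<'EOF'
index:
  analysis:
    analyzer:
      ik:
        alias: [ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
        type: ik
        use_smart: false
      ik_smart:
        type: ik
        use_smart: true
index.analysis.analyzer.default.type: ik
EOF
# Sanity check: both ik_max_word and ik_smart should now be present.
grep -c 'use_smart' "$ES_HOME/config/elasticsearch.yml"
```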
Testing IK Segmentation
Create an index named index:
curl -XPUT http://localhost:9200/index

Create a mapping for the index:
curl -XPOST http://localhost:9200/index/fulltext/_mapping -d '
{
    "fulltext": {
        "_all": {
            "analyzer": "ik"
        },
        "properties": {
            "content": {
                "type": "string",
                "boost": 8.0,
                "term_vector": "with_positions_offsets",
                "analyzer": "ik",
                "include_in_all": true
            }
        }
    }
}'

Test it:
curl 'http://localhost:9200/index/_analyze?analyzer=ik&pretty=true' -d '{ "text":"世界如此之大"}'
{
  "tokens" : [ {
    "token" : "text",
    "start_offset" : 4,
    "end_offset" : 8,
    "type" : "ENGLISH",
    "position" : 1
  }, {
    "token" : "世界",
    "start_offset" : 11,
    "end_offset" : 13,
    "type" : "CN_WORD",
    "position" : 2
  }, {
    "token" : "如此之",
    "start_offset" : 13,
    "end_offset" : 16,
    "type" : "CN_WORD",
    "position" : 3
  }, {
    "token" : "如此",
    "start_offset" : 13,
    "end_offset" : 15,
    "type" : "CN_WORD",
    "position" : 4
  }, {
    "token" : "之大",
    "start_offset" : 15,
    "end_offset" : 17,
    "type" : "CN_WORD",
    "position" : 5
  } ]
}
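One quirk in the output above: the first token is the literal word "text" with type ENGLISH. On these older Elasticsearch versions the _analyze endpoint treats the whole request body as plain text rather than parsing it as JSON, so the JSON wrapper is tokenized along with the sentence (the offsets count the wrapper's characters too). A small sketch of pulling just the token values out of a saved copy of the response; the file name /tmp/analyze-response.json is my choice:

```shell
# Save the analyze response locally (copied from the output above),
# then extract just the token values with grep and sed.
cat > /tmp/analyze-response.json <<'EOF'
{ "tokens" : [
  { "token" : "text", "start_offset" : 4, "end_offset" : 8, "type" : "ENGLISH", "position" : 1 },
  { "token" : "世界", "start_offset" : 11, "end_offset" : 13, "type" : "CN_WORD", "position" : 2 },
  { "token" : "如此之", "start_offset" : 13, "end_offset" : 16, "type" : "CN_WORD", "position" : 3 },
  { "token" : "如此", "start_offset" : 13, "end_offset" : 15, "type" : "CN_WORD", "position" : 4 },
  { "token" : "之大", "start_offset" : 15, "end_offset" : 17, "type" : "CN_WORD", "position" : 5 }
] }
EOF
# Prints one token per line: text, 世界, 如此之, 如此, 之大
grep -o '"token" : "[^"]*"' /tmp/analyze-response.json | sed 's/.*: "//; s/"$//'
```

Passing the sentence as a raw body instead (-d '世界如此之大') should avoid the stray "text" token on these versions.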

