php - Elasticsearch: after setting the content field to the ik analyzer, use a terms aggregation to build a hot-words feature

Posted by 手机用户2602886105 on 2022-11-28 06:42

Mapping of type 201038447 in index msg2017-04

{
  "msg2017-04": {
    "mappings": {
      "201038447": {
        "properties": {
          "@timestamp": {"type": "date"},
          "content": {
            "type": "text",
            "boost": 8,
            "analyzer": "ik_smart",
            "include_in_all": true
          },
          "createTime": {"type": "date"}
        }
      }
    }
  }
}

Index settings

{
  "msg2017-04": {
    "settings": {
      "index": {
        "creation_date": "1492398234434",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "yiGoDhL1T3WLexG79e5uQg",
        "version": {"created": "5020299"},
        "provided_name": "msg2017-04"
      }
    }
  }
}

Environment:

  • linux

  • elasticsearch 5.2.2

  • ik analyzer plugin installed

Tokenization result

// request
GET /msg2017-04/_search?pretty
{
  "size": 1,
  "aggs": {
    "fenci": {
      "terms": {"field": "content.ik_smart"}
    }
  }
}

// response
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 105,
    "max_score": 1,
    "hits": [
      {
        "_index": "msg2017-04",
        "_type": "7510570179@chatroom",
        "_id": "5067959408840553063",
        "_score": 1,
        "_source": {
          "wxid": "wxid_1idf7gf5jgh822",
          "msgId": "69",
          "msgSvrId": "5067959408840553063",
          "type": 0,
          "isSend": "1",
          "status": "2",
          "speakerId": "",
          "content": "rhh",
          "imei": "867464024215618",
          "room": "7510570179@chatroom",
          "roomName": "和湖光山色hzhzh",
          "roomOwner": "mikezhangsky",
          "roomMembers": "mikezhangsky;wxid_1idf7gf5jgh822;wxid_j56srpxywn5n22;wxid_90uy0wlz229e22;sun461629376",
          "roomSize": "5",
          "createTime": "2017-04-07T03:08:37",
          "@timestamp": "2017-04-17T03:14:15"
        }
      }
    ]
  },
  "aggregations": {
    "fenci": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}
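
A likely reason the buckets are empty: the mapping shown earlier defines only `content` (analyzed with `ik_smart`) and no sub-field named `content.ik_smart`, and a terms aggregation on an unmapped field returns no buckets. In Elasticsearch 5.x a terms aggregation on a `text` field also requires `fielddata` to be enabled on that field. The following is only a minimal Python sketch of the two request bodies under those assumptions. The helper names `fielddata_mapping_body` and `terms_on_content_body` are hypothetical, and the existing `content` settings are copied from the question's mapping.

```python
import json


def fielddata_mapping_body():
    """Put-mapping body: the existing settings of `content`, plus fielddata.

    Assumption: the existing parameters (type, boost, analyzer,
    include_in_all) are repeated unchanged so the update does not conflict
    with the current mapping.
    """
    return {
        "properties": {
            "content": {
                "type": "text",
                "boost": 8,
                "analyzer": "ik_smart",
                "include_in_all": True,
                "fielddata": True,  # lets aggregations read the analyzed tokens
            }
        }
    }


def terms_on_content_body(size=10):
    """Search body that aggregates on `content` itself, not `content.ik_smart`."""
    return {
        "size": 0,  # only the aggregation is needed, no hits
        "aggs": {"fenci": {"terms": {"field": "content", "size": size}}},
    }


# Print the two requests in the same console style as the question.
print("PUT /msg2017-04/_mapping/201038447")
print(json.dumps(fielddata_mapping_body(), indent=2))
print("GET /msg2017-04/_search?pretty")
print(json.dumps(terms_on_content_body(), indent=2))
```

Note that fielddata is held in heap memory, so enabling it on a large text field can be expensive. The index may also contain other types with a `content` field (the hit above has `_type` `7510570179@chatroom`), in which case the mapping update may have to cover those types as well.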

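I want to aggregate on the terms produced by Chinese tokenization, so that I can compute the hot words for a given time period in real time, similar to Weibo's trending searches.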
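
The goal described above could be sketched roughly as follows in Python: restrict the search to a time window on `createTime`, aggregate the tokens of `content`, and read the buckets from the response. This assumes fielddata is enabled on `content`. The function names `hot_words_query` and `top_terms` and the example dates are hypothetical, while the aggregation name `fenci` is taken from the question.

```python
def hot_words_query(start, end, top_n=10):
    """Search body for the top_n terms of `content` within [start, end)."""
    return {
        "size": 0,  # skip the hits, only the aggregation matters
        "query": {"range": {"createTime": {"gte": start, "lt": end}}},
        "aggs": {"fenci": {"terms": {"field": "content", "size": top_n}}},
    }


def top_terms(response, agg_name="fenci"):
    """Return (term, doc_count) pairs from a terms aggregation response."""
    agg = response.get("aggregations", {}).get(agg_name, {})
    return [(b["key"], b["doc_count"]) for b in agg.get("buckets", [])]


# Example: request body for April 2017 (dates chosen for illustration only).
body = hot_words_query("2017-04-01T00:00:00", "2017-05-01T00:00:00")
print(body)
```

The `doc_count` of each bucket is the number of documents that contain the term, not the number of times the term occurs, so a word repeated inside one message is counted once. Very common words will probably dominate the result unless they are filtered out, for example with a stop-word list in the analyzer.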
