My documents are structured roughly like this:
{ title: string, description: string, privacy_mode: string, hidden: boolean, added_by: string, topics: array }
I am trying to query Elasticsearch, but I don't want any documents whose topics array is empty.
Below is the function that builds the query object:
function getQueryObject(data) {
  var orList = [{
    "term": {"privacy_mode": "public", "hidden": false}
  }];
  if (data.user) {
    orList.push({
      "term": {"added_by": data.user}
    });
  }
  var queryObj = {
    "fields": ["title", "topics", "added_by", "img_url", "url", "type"],
    "query": {
      "filtered" : {
        "query" : {
          "multi_match" : {
            "query" : data.query + '*',
            "fields" : ["title^4", "topics", "description^3", "tags^2", "body^2", "keywords", "entities", "_id"]
          }
        },
        "filter" : {
          "or": orList
        },
        "filter" : {
          "limit" : {"value" : 15}
        },
        "filter": {
          "script": {
            "script": "doc['topics'].values.length > 0"
          }
        }
      }
    }
  };
  return queryObj;
};
This still returns documents with an empty topics array. Any idea what is wrong?
Thanks for your help.
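You probably want the missing filter. Your script approach loads all the values of topics into memory, which is very wasteful unless you are also using them for something else, such as faceting.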
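Also, the structure of your filter is wrong. You cannot repeat the filter key; wrap the individual filters in a bool filter instead. (Here is why you generally want bool rather than and|or|not: http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/)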
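To illustrate why the repeated key matters: a JavaScript object literal with a repeated key does not raise an error in current engines; the last value silently replaces the earlier ones. The sketch below uses a trimmed-down version of the object from the question (the variable name `filtered` is only for illustration).

```javascript
// Trimmed-down version of the object from the question: three "filter" keys.
var filtered = {
  "filter": { "or": [{ "term": { "privacy_mode": "public" } }] },
  "filter": { "limit": { "value": 15 } },
  "filter": { "script": { "script": "doc['topics'].values.length > 0" } }
};

// Only one "filter" key survives, and it holds the last definition.
console.log(Object.keys(filtered));        // [ 'filter' ]
console.log(Object.keys(filtered.filter)); // [ 'script' ]
```

So the or and limit filters never reach Elasticsearch at all; only the script filter is sent.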
Lastly, you probably want to specify size on the search object instead of using the limit filter.
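Putting these three points together, the function from the question could be restructured roughly as follows. This is only a sketch for the pre-5.0 filtered query syntax. It assumes the original two-field term was meant as "public and not hidden", and that a document should match if it is either visible in that sense or was added by the current user.

```javascript
// Sketch of the query builder using a single bool filter (Elasticsearch 1.x syntax).
function getQueryObject(data) {
  // Assumption: "public" and "not hidden" must both hold for a visible document.
  var visibility = [{
    "bool": {
      "must": [
        { "term": { "privacy_mode": "public" } },
        { "term": { "hidden": false } }
      ]
    }
  }];
  // Documents added by the current user are also allowed.
  if (data.user) {
    visibility.push({ "term": { "added_by": data.user } });
  }

  return {
    "size": 15, // replaces the limit filter
    "fields": ["title", "topics", "added_by", "img_url", "url", "type"],
    "query": {
      "filtered": {
        "query": {
          "multi_match": {
            "query": data.query + '*',
            "fields": ["title^4", "topics", "description^3", "tags^2",
                       "body^2", "keywords", "entities", "_id"]
          }
        },
        // One filter key only: everything is combined inside bool.
        "filter": {
          "bool": {
            "must": [{ "bool": { "should": visibility } }],
            "must_not": [{ "missing": { "field": "topics" } }]
          }
        }
      }
    }
  };
}

// Example call with made-up input values.
console.log(JSON.stringify(getQueryObject({ query: "elastic", user: "alice" }), null, 2));
```

The nested bool with should plays the role of the original or list, while must_not with missing drops documents that have no topics.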
I made a runnable example you can play with: https://www.found.no/play/gist/aa59b987269a24feb763
#!/bin/bash

export ELASTICSEARCH_ENDPOINT="http://localhost:9200"

# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"privacy_mode":"public","topics":["foo","bar"]}
{"index":{"_index":"play","_type":"type"}}
{"privacy_mode":"private","topics":[]}
'

# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {
                            "term": {
                                "privacy_mode": "public"
                            }
                        }
                    ],
                    "must_not": [
                        {
                            "missing": {
                                "field": "topics"
                            }
                        }
                    ]
                }
            }
        }
    }
}
'
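The missing keyword was removed as of ES 5.0; the suggested replacement is exists wrapped in must_not. Note that the query below is the direct equivalent of the old missing filter, so it matches documents that have no topics; to exclude those documents, use exists as a positive clause instead: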
curl -XGET 'localhost:9200/_search?pretty' -H 'Content-Type: application/json' -d'
{
    "query": {
        "bool": {
            "must_not": {
                "exists": {
                    "field": "topics"
                }
            }
        }
    }
}'
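For the original goal of keeping only documents that do have topics, on 5.0 or later the exists query can go directly into the filter clause of a bool query. The filtered query is also gone in 5.0, so bool takes its place. A minimal sketch, mirroring the runnable example above; the helper name `buildTopicsQuery` and its parameter are hypothetical.

```javascript
// Hypothetical helper: builds a search body for Elasticsearch 5.0 or later.
function buildTopicsQuery(privacyMode) {
  return {
    "size": 15,
    "query": {
      "bool": {
        // Filter clauses must all match and do not affect scoring.
        "filter": [
          { "term": { "privacy_mode": privacyMode } },
          // exists skips documents whose topics field is absent, null, or [].
          { "exists": { "field": "topics" } }
        ]
      }
    }
  };
}

// Print the body so it can be passed to the _search endpoint.
console.log(JSON.stringify(buildTopicsQuery("public"), null, 2));
```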