I'm having a lot of trouble understanding the basics of the ES query system.
For example, I have the following query:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "term": { "referer": "www.xx.yy.com" } },
        { "range": { "@timestamp": { "gte": "now", "lt": "now-1h" } } }
      ]
    }
  },
  "aggs": {
    "interval": {
      "date_histogram": { "field": "@timestamp", "interval": "0.5h" },
      "aggs": {
        "what": { "cardinality": { "field": "host" } }
      }
    }
  }
}
This request pulls in too much data and fails with:
"status":500,"reason":"ElasticsearchException[org.elasticsearch.common.breaker.CircuitBreakingException: Data too large, data for field [@timestamp] would be larger than limit of [3200306380/2.9gb]]; nested: UncheckedExecutionException[org.elasticsearch.common.breaker.CircuitBreakingException: Data too large, data for field [@timestamp] would be larger than limit of [3200306380/2.9gb]]; nested: CircuitBreakingException[Data too large, data for field [@timestamp] would be larger than limit of [3200306380/2.9gb]]; "
I also tried this request:
{
  "size": 0,
  "filter": {
    "and": [
      { "term": { "referer": "www.geoportail.gouv.fr" } },
      { "range": { "@timestamp": { "from": "2014-10-04", "to": "2014-10-05" } } }
    ]
  },
  "aggs": {
    "interval": {
      "date_histogram": { "field": "@timestamp", "interval": "0.5h" },
      "aggs": {
        "what": { "cardinality": { "field": "host" } }
      }
    }
  }
}
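I would like to filter the data so that I get a correct result. Any help would be much appreciated!
You can try clearing the cache first and then running the query above, as shown here.
Another solution could be to remove the interval or to reduce the time range in your query...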
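One way to narrow what the query touches, sketched below, is to search only the indices that cover the days you need. This assumes the data sits in Logstash-style daily indices named `logstash-YYYY.MM.dd`, which the question does not actually state (the `@timestamp` field only hints at it). The helper names `daily_indices` and `search_path` are made up for illustration.

```python
from datetime import date, timedelta


def daily_indices(start, end, prefix="logstash-"):
    """Return one index name per day in the half-open range [start, end).

    Assumption: indices follow the daily pattern prefix + YYYY.MM.dd
    (Logstash's default naming, based on UTC dates).
    """
    days = (end - start).days
    return [
        "{}{}".format(prefix, (start + timedelta(days=i)).strftime("%Y.%m.%d"))
        for i in range(days)
    ]


def search_path(start, end, prefix="logstash-"):
    """Build a _search path limited to the daily indices in [start, end).

    Elasticsearch accepts a comma-separated list of index names in the URL.
    """
    return "/" + ",".join(daily_indices(start, end, prefix)) + "/_search"


if __name__ == "__main__":
    # Covers 2014-10-04 only, matching the one-day window from the question.
    print(search_path(date(2014, 10, 4), date(2014, 10, 5)))
```

The resulting path would be used in place of a bare `/_search`, with the same request body as before. If the data lives in a single large index, this does not help and the other suggestions apply instead.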
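My best bet would be to either clear the cache first, or allocate more memory to elasticsearch (more here).
I found a solution, and it is a bit weird. I followed dimzak's advice and cleared the cache: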
curl --noproxy localhost -XPOST "http://localhost:9200/_cache/clear"
Then I used a filter instead of a query for the time range, as Olly suggested:
{
  "size": 0,
  "query": {
    "filtered": {
      "query": { "term": { "referer": "www.xx.yy.fr" } },
      "filter": {
        "range": { "@timestamp": { "from": "2014-10-04T00:00", "to": "2014-10-05T00:00" } }
      }
    }
  },
  "aggs": {
    "interval": {
      "date_histogram": { "field": "@timestamp", "interval": "0.5h" },
      "aggs": {
        "what": { "cardinality": { "field": "host" } }
      }
    }
  }
}
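In case it helps anyone who wants to vary the referer or the time window, here is a minimal sketch that builds the same request body in Python. The function name `build_query` and its parameters are invented for this example. The body uses the `filtered` query exactly as in the request above, which is syntax from older Elasticsearch releases, so it would need adapting on newer versions.

```python
import json


def build_query(referer, start, end, interval="0.5h"):
    """Build the request body from the post as a plain dict.

    The term query selects the referer, the range filter restricts
    @timestamp to [start, end], and the aggregation counts distinct
    hosts per histogram bucket.
    """
    return {
        "size": 0,  # only aggregation results are wanted, no hits
        "query": {
            "filtered": {
                "query": {"term": {"referer": referer}},
                "filter": {
                    "range": {"@timestamp": {"from": start, "to": end}}
                },
            }
        },
        "aggs": {
            "interval": {
                "date_histogram": {"field": "@timestamp", "interval": interval},
                "aggs": {"what": {"cardinality": {"field": "host"}}},
            }
        },
    }


if __name__ == "__main__":
    body = build_query("www.xx.yy.fr", "2014-10-04T00:00", "2014-10-05T00:00")
    # Print the JSON that would be sent as the body of the _search request.
    print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with `-d`, the same way as any other request body.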
I can't accept both answers. I think dimzak deserves the accepted one, but thumbs up to both of you :)