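The start of the story: counting records per hour
Back when I used MySQL, aggregation felt simple: GROUP BY and you're done. Only after I recently started using Elasticsearch did I realize how friendly relational databases really are o(╥﹏╥)o
First, ES has the concept of buckets: a terms aggregation puts the matching documents into one bucket per distinct value of the given field, which is roughly what GROUP BY does in MySQL.
Reference: Elasticsearch: The Definitive Guide » Aggregations » High-Level Concepts
A simple grouping looks like this: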
{
  "size": 0,
  "aggs": {
    "record": {
      "terms": {
        "field": "belongParkId",
        "size": 5
      }
    }
  }
}
Note: record is a name I chose for the aggregation result; it can be anything.
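For readers who build the request from code, here is a minimal Python sketch that produces the same body. The helper name `terms_agg_body` is my own and hypothetical, not part of any Elasticsearch client; it only assembles a dict that can be sent as the JSON body of a search request.

```python
import json


def terms_agg_body(field, size=5, name="record"):
    """Build a search body that buckets documents by `field` (hypothetical helper)."""
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {
            name: {  # `name` is the user-chosen key the result comes back under
                "terms": {"field": field, "size": size}
            }
        },
    }


body = terms_agg_body("belongParkId")
print(json.dumps(body, indent=2))
```

By default a terms aggregation orders buckets by doc_count descending, so `size: 5` keeps the five most frequent values, much like adding ORDER BY COUNT(*) DESC LIMIT 5 to the GROUP BY.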
The terms query returns a result like this:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 605,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "record": {
      "doc_count_error_upper_bound": 10,
      "sum_other_doc_count": 514,
      "buckets": [
        {
          "key": 97,
          "doc_count": 27
        },
        {
          "key": 194,
          "doc_count": 26
        },
        {
          "key": 175,
          "doc_count": 15
        },
        {
          "key": 26,
          "doc_count": 12
        },
        {
          "key": 133,
          "doc_count": 11
        }
      ]
    }
  }
}
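To make the numbers in this response easier to read, here is a small Python sketch that pulls out the buckets and checks the arithmetic. The function name `summarize_terms` is hypothetical, and the check assumes every document carries exactly one belongParkId value. sum_other_doc_count is the number of documents that fell into buckets not returned because of `size: 5`.

```python
def summarize_terms(response, name="record"):
    """Return (per-key counts, docs in returned buckets, docs in omitted buckets)."""
    agg = response["aggregations"][name]
    counts = {b["key"]: b["doc_count"] for b in agg["buckets"]}
    in_buckets = sum(counts.values())
    other = agg["sum_other_doc_count"]
    return counts, in_buckets, other


# Trimmed copy of the response shown above.
response = {
    "hits": {"total": 605},
    "aggregations": {
        "record": {
            "doc_count_error_upper_bound": 10,
            "sum_other_doc_count": 514,
            "buckets": [
                {"key": 97, "doc_count": 27},
                {"key": 194, "doc_count": 26},
                {"key": 175, "doc_count": 15},
                {"key": 26, "doc_count": 12},
                {"key": 133, "doc_count": 11},
            ],
        }
    },
}

counts, in_buckets, other = summarize_terms(response)
print(in_buckets, other, in_buckets + other)  # 91 514 605
```

The five returned buckets hold 91 documents and the omitted buckets hold 514, which adds up to the 605 total hits. A non-zero doc_count_error_upper_bound means the per-bucket counts may be slightly off, because each shard only reports its own top terms.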
Now for today's main topic: bucketing by hour. Reference: Elasticsearch: The Definitive Guide » Aggregations » Looking at Time
{
  "size": 0,
  "aggs": {
    "record": {
      "date_histogram": {
        "field": "updateTime",
        "interval": "hour",
        "time_zone": "+08:00",
        "format": "yyyy-MM-dd HH"
      }
    }
  }
}
Note: record is again a name I chose for the aggregation result.
The date_histogram query returns a result like this:
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 605,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "record": {
      "buckets": [
        {
          "key_as_string": "2021-10-27 15",
          "key": 1635346800000,
          "doc_count": 605
        }
      ]
    }
  }
}
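Since the goal was "records per hour", the part of this response that matters is the bucket list. A minimal Python sketch for flattening it into an hour-to-count mapping follows; the function name `hourly_counts` is hypothetical, and it relies on key_as_string being present, which is the case here because the request set a `format`.

```python
def hourly_counts(response, name="record"):
    """Map each bucket's formatted hour (key_as_string) to its doc_count."""
    buckets = response["aggregations"][name]["buckets"]
    return {b["key_as_string"]: b["doc_count"] for b in buckets}


# Trimmed copy of the response shown above.
response = {
    "aggregations": {
        "record": {
            "buckets": [
                {
                    "key_as_string": "2021-10-27 15",
                    "key": 1635346800000,
                    "doc_count": 605,
                }
            ]
        }
    }
}

for hour, count in hourly_counts(response).items():
    print(hour, count)
```

With this sample data all 605 records land in a single hour, so the mapping has one entry.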
Note that the date_histogram aggregation can only be used on a field whose type is date.
The interval parameter accepts several keywords: year, quarter, month, week, day, hour, minute, second.
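Here is a hedged Python sketch that builds the date_histogram body and rejects anything other than the keywords listed above. The names `INTERVAL_KEYWORDS` and `date_histogram_body` are my own and hypothetical. The check is deliberately narrower than Elasticsearch itself, which also accepts time-unit values such as `1h`. On Elasticsearch 7.2 and later, `interval` is deprecated in favor of `calendar_interval` and `fixed_interval`, so the parameter name would need to change there.

```python
INTERVAL_KEYWORDS = (
    "year", "quarter", "month", "week", "day", "hour", "minute", "second",
)


def date_histogram_body(field, interval="hour", time_zone="+08:00",
                        fmt="yyyy-MM-dd HH", name="record"):
    """Build a search body that buckets documents by time (hypothetical helper)."""
    if interval not in INTERVAL_KEYWORDS:
        raise ValueError(f"unsupported interval keyword: {interval!r}")
    return {
        "size": 0,  # aggregation result only, no hits
        "aggs": {
            name: {
                "date_histogram": {
                    "field": field,          # must be mapped as a date field
                    "interval": interval,    # bucket width
                    "time_zone": time_zone,  # bucket boundaries follow this zone
                    "format": fmt,           # controls key_as_string
                }
            }
        },
    }


body = date_histogram_body("updateTime")
print(body["aggs"]["record"]["date_histogram"]["interval"])  # hour
```

Switching to daily counts is then a matter of calling `date_histogram_body("updateTime", interval="day", fmt="yyyy-MM-dd")`.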
For reference, an index definition that maps a field as date looks like this:
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 0
  },
  "mappings": {
    "books": {
      "properties": {
        "datetime": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
        }
      }
    }
  }
}
Note: books is a type name I chose; it can be anything.
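To illustrate which input values a mapping with these formats is meant to accept, here is a rough Python approximation. It is not Elasticsearch's own date parser: the function name `parse_like_mapping` and the strptime patterns are my assumptions, chosen to mirror the four text formats plus epoch_millis. Values without a time zone are treated as UTC, which is how Elasticsearch interprets them.

```python
from datetime import datetime, timezone

# Python equivalents of the text formats in the mapping, tried in order.
STRPTIME_FORMATS = (
    "%Y-%m-%d %H:%M:%S",  # yyyy-MM-dd HH:mm:ss
    "%Y-%m-%d",           # yyyy-MM-dd
    "%Y/%m/%d %H:%M:%S",  # yyyy/MM/dd HH:mm:ss
    "%Y/%m/%d",           # yyyy/MM/dd
)


def parse_like_mapping(value):
    """Parse a value the way the mapping's format list suggests (approximation)."""
    text = str(value).strip()
    for fmt in STRPTIME_FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    if text.isdigit():  # epoch_millis: milliseconds since 1970-01-01 UTC
        return datetime.fromtimestamp(int(text) / 1000, tz=timezone.utc)
    raise ValueError(f"value matches none of the mapped formats: {value!r}")


for sample in ("2021-10-27 15:00:00", "2021/10/27", 1635346800000):
    print(sample, "->", parse_like_mapping(sample).isoformat())
```

A value such as `27.10.2021` matches none of the formats and raises ValueError here; in Elasticsearch, indexing a document with a date value that matches none of the mapped formats is rejected as well.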