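Besides the query DSL, aggregations are an important part of Elasticsearch (ES). If the query DSL is the counterpart of a SQL query, aggregations are the counterpart of GROUP BY together with functions such as SUM and COUNT.

What aggregations can do

Aggregations fall into three main categories: bucket aggregations, metric aggregations, and pipeline aggregations. In a request body, aggregations can be abbreviated to aggs.

        Bucket aggregations (bucket): the counterpart of SQL's GROUP BY. They split documents into buckets (groups) by one or more criteria and by default return the document count (doc_count) of each bucket. More importantly, once the documents are bucketed, further aggregations or statistics can be computed per bucket. Concrete examples follow below.

        Metric aggregations (metric): compute values over bucketed or unbucketed documents, for example avg (mean), max, min, value_count (number of values), cardinality (approximate number of distinct values), and stats (count, min, max, avg, and sum in one response).

        Pipeline aggregations (pipeline): operate on the results of other aggregations. Buckets can be nested several levels deep and can contain metric aggregations. A pipeline aggregation uses a path (buckets_path, built from the names given to the bucket and metric aggregations) to select the values it works on, for example to sort the buckets or to pick the smallest value among them.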
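
The difference between value_count and cardinality is easy to miss, so here is a minimal sketch in plain Python. It is not the Elasticsearch client; the list simply mirrors the type field of the 11 sample products indexed later in this post. Note that the real cardinality aggregation is approximate (it is based on HyperLogLog++), whereas this emulation is exact.

```python
# Plain-Python illustration only; no Elasticsearch call is made here.
# The values mirror the "type" field of the 11 sample products.
types = ["手机"] * 6 + ["耳机"] * 2 + ["电视"] * 3

value_count = len(types)       # value_count: how many values were seen
cardinality = len(set(types))  # cardinality: how many distinct values exist

print(value_count, cardinality)
```

On a field this small the approximate and exact distinct counts should agree, but on high-cardinality fields the ES result can deviate slightly.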

How to use aggregations

 First, index some sample data into ES

PUT /product/_doc/1
{
    "name" : "小米手机",
    "desc" :  "手机中的战斗机",
    "price" :  3999,
    "lv":"旗舰机",
    "type":"手机",
    "createtime":"2020-10-01T08:00:00Z",
    "tags": [ "性价比", "发烧", "不卡顿" ]
}
PUT /product/_doc/2
{
    "name" : "小米NFC手机",
    "desc" :  "支持全功能NFC,手机中的滑翔机",
    "price" :  4999,
    "lv":"旗舰机",
    "type":"手机",
    "createtime":"2020-05-21T08:00:00Z",
    "tags": [ "性价比", "发烧", "公交卡" ]
}
PUT /product/_doc/3
{
    "name" : "NFC手机",
    "desc" :  "手机中的轰炸机",
    "price" :  2999,
    "lv":"高端机",
    "type":"手机",
    "createtime":"2020-06-20",
    "tags": [ "性价比", "快充", "门禁卡" ]
}
PUT /product/_doc/4
{
    "name" : "小米耳机",
    "desc" :  "耳机中的黄焖鸡",
    "price" :  999,
    "lv":"百元机",
    "type":"耳机",
    "createtime":"2020-06-23",
    "tags": [ "降噪", "防水", "蓝牙" ]
}
PUT /product/_doc/5
{
    "name" : "红米耳机",
    "desc" :  "耳机中的肯德基",
    "price" :  399,
    "type":"耳机",
    "lv":"百元机",
    "createtime":"2020-07-20",
    "tags": [ "防火", "低音炮", "听声辨位" ]
}
PUT /product/_doc/6
{
    "name" : "小米手机12",
    "desc" :  "充电贼快掉电更快,超级无敌望远镜,高刷电竞屏",
    "price" :  5999,
    "lv":"旗舰机",
    "type":"手机",
    "createtime":"2020-07-27",
    "tags": [ "120HZ刷新率", "120W快充", "120倍变焦" ]
}
PUT /product/_doc/7
{
    "name" : "挨炮 SE2",
    "desc" :  "除了CPU,一无是处",
    "price" :  3299,
    "lv":"旗舰机",
    "type":"手机",
    "createtime":"2020-07-21",
    "tags": [ "割韭菜", "割韭菜", "割新韭菜" ]
}
PUT /product/_doc/8
{
    "name" : "XS Max",
    "desc" :  "听说要出新款15手机了,终于可以换掉手中的4S了",
    "price" :  4399,
    "lv":"旗舰机",
    "type":"手机",
    "createtime":"2020-08-19",
    "tags": [ "5V1A", "4G全网通", "大" ]
}
PUT /product/_doc/9
{
    "name" : "小米电视",
    "desc" :  "70寸性价比只选,不要一万八,要不要八千八,只要两千九百九十八",
    "price" :  2998,
    "lv":"高端机",
    "type":"电视",
    "createtime":"2020-08-16",
    "tags": [ "巨馍", "家庭影院", "游戏" ]
}
PUT /product/_doc/10
{
    "name" : "红米电视",
    "desc" :  "我比上边那个更划算,我也2998,我也70寸,但是我更好看",
    "price" :  2999,
    "type":"电视",
    "lv":"高端机",
    "createtime":"2020-08-28",
    "tags": [ "大片", "蓝光8K", "超薄" ]
}
PUT /product/_doc/11
{
  "name": "红米电视",
  "desc": "我比上边那个更划算,我也2998,我也70寸,但是我更好看",
  "price": 2998,
  "type": "电视",
  "lv": "高端机",
  "createtime": "2020-08-28",
  "tags": [
    "大片",
    "蓝光8K",
    "超薄"
  ]
}

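A simple bucket aggregation example

	GET product/_search
	{
	  "size": 0,
	  "aggs": {
	    "tagtest": { // name given to this aggregation
	      "terms": { // bucketing method
	        "field": "tags.keyword", // bucket by tags; the keyword sub-field is not analyzed, so the raw value is used
	        "size": 10
	      }
	    }
	  }
	}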
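
To make the expected shape of the result concrete, the following plain-Python sketch emulates what the terms aggregation above counts for the sample data. It is an illustration, not an Elasticsearch call. It assumes that a document contributes at most once to a bucket even when a tag is repeated inside its tags array (as in document 7), and the helper name terms_doc_counts is made up for this example.

```python
from collections import Counter

# Tags of the 11 sample documents, in id order (1..11).
doc_tags = [
    ["性价比", "发烧", "不卡顿"],
    ["性价比", "发烧", "公交卡"],
    ["性价比", "快充", "门禁卡"],
    ["降噪", "防水", "蓝牙"],
    ["防火", "低音炮", "听声辨位"],
    ["120HZ刷新率", "120W快充", "120倍变焦"],
    ["割韭菜", "割韭菜", "割新韭菜"],
    ["5V1A", "4G全网通", "大"],
    ["巨馍", "家庭影院", "游戏"],
    ["大片", "蓝光8K", "超薄"],
    ["大片", "蓝光8K", "超薄"],
]


def terms_doc_counts(docs):
    """Count, for every tag, the number of documents that contain it."""
    counts = Counter()
    for tags in docs:
        counts.update(set(tags))  # set(): one document counts once per tag
    return counts


tag_counts = terms_doc_counts(doc_tags)
# Like "size": 10 with the default descending count order.
# The order among tags with equal counts may differ from what ES returns.
top10 = tag_counts.most_common(10)
print(top10)
```

Only 10 of the 26 distinct tags are returned, which is why raising size matters when every bucket is needed.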

Simple metric aggregation examples

	GET product/_search
	{
	  "size": 0,
	  "aggs": {
	    "max": {
	      "max": {
	        "field": "price"
	      }
	    },
	    "min": {
	      "min": {
	        "field": "price"
	      }
	    },
	    "avg": {
	      "avg": {
	        "field": "price"
	      }
	    }
	  }
	}
	GET product/_search
	{
	  "size": 0,
	  "aggs": {
	    "price_stats": {
	      "stats": {
	        "field": "price"
	      }
	    }
	  }
	}
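
As a cross-check for the two requests above, this sketch computes the same five numbers that a stats aggregation reports, using the prices of the 11 sample documents. It is plain Python rather than an ES call, and the function name stats is only chosen to mirror the aggregation.

```python
# Prices of the 11 sample documents, in id order (1..11).
prices = [3999, 4999, 2999, 999, 399, 5999, 3299, 4399, 2998, 2999, 2998]


def stats(values):
    """Return count, min, max, avg and sum, like a stats-style aggregation."""
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


price_stats = stats(prices)
print(price_stats)
```

The first request returns max, min, and avg as three separately named results, while the second returns all of them under the single name price_stats.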

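Pipeline aggregation example

Goal: after grouping by type and averaging the price, take the minimum of those averages

Nested aggregation

Bucket the products by type and then by lv (level) as a nested sub-aggregation, compute the average price of each lv bucket with avg, then use the min_bucket pipeline aggregation to find, within each type bucket, the lv bucket with the lowest average price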

		GET product/_search
		{
		  "size": 0,
		  "aggs": {
		    "type_lv": {
		      "terms": {
		        "field": "type.keyword"
		      },
		      "aggs": {
		        "lv": {
		          "terms": {
		            "field": "lv.keyword"
		          },
		          "aggs": {
		            "price_avg": {
		              "avg": {
		                "field": "price"
		              }
		            }
		          }
		        },
		        "price_min": {
		          "min_bucket": {
		            "buckets_path": "lv>price_avg"
		          }
		        }
		      }
		    }
		  }
		}
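
The nesting is easier to follow with concrete numbers. The sketch below emulates the request above in plain Python: group by type, group by lv inside each type, average the price per lv, then take the smallest average per type. It is not an ES call, and the function name min_avg_per_type and the result layout are made up for illustration.

```python
from collections import defaultdict

# (type, lv, price) of the 11 sample documents, in id order.
products = [
    ("手机", "旗舰机", 3999), ("手机", "旗舰机", 4999), ("手机", "高端机", 2999),
    ("耳机", "百元机", 999), ("耳机", "百元机", 399), ("手机", "旗舰机", 5999),
    ("手机", "旗舰机", 3299), ("手机", "旗舰机", 4399), ("电视", "高端机", 2998),
    ("电视", "高端机", 2999), ("电视", "高端机", 2998),
]


def min_avg_per_type(rows):
    """Per type: average price of each lv bucket, plus the lowest average."""
    grouped = defaultdict(lambda: defaultdict(list))
    for type_, lv, price in rows:
        grouped[type_][lv].append(price)

    result = {}
    for type_, lv_buckets in grouped.items():
        averages = {lv: sum(p) / len(p) for lv, p in lv_buckets.items()}
        lowest = min(averages.values())
        result[type_] = {
            "price_avg": averages,
            # min_bucket reports the value and the key(s) of the winning bucket
            "price_min": {
                "value": lowest,
                "keys": [lv for lv, avg in averages.items() if avg == lowest],
            },
        }
    return result


type_lv = min_avg_per_type(products)
print(type_lv["手机"]["price_min"])
```

Because price_min is declared inside the type_lv bucket, there is one minimum per type rather than a single minimum for the whole index.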

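Aggregating over query results

A query or filter can be added at the same level as aggs. The aggregations then run only over the documents that match it

For example, bucket by type while restricting the price range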

		GET product/_search
		{
		  "query": {
		    "range": {
		      "price": {
		        "gte": 2000,
		        "lte": 6000
		      } 
		    }
		  }, 
		  "aggs": {
		    "type_lv": {
		      "terms": {
		        "field": "type.keyword"
		      }
		    }
		  }
		}
		

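Filter the documents with a bool filter clause first, then bucket them. A filter clause does not compute relevance scores, which is all that is needed when only the aggregation result matters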

		GET product/_search
		{
		  "query": {
		    "bool": {
		      "filter": [
		        {
		          "range": {
		            "price": {
		              "gte": 10,
		              "lte": 2000
		            }
		          }
		          
		        }
		      ]
		      
		    }
		  },
		  "aggs": {
		    "type_lv": {
		      "terms": {
		        "field": "type.keyword"
		      }
		    }
		  }
		}
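
Both requests above follow the same pattern: select documents first, then bucket whatever is left. A minimal plain-Python sketch of that pattern, using the sample data and a hypothetical helper count_by_type, could look like this.

```python
from collections import Counter

# (type, price) of the 11 sample documents, in id order.
products = [
    ("手机", 3999), ("手机", 4999), ("手机", 2999), ("耳机", 999),
    ("耳机", 399), ("手机", 5999), ("手机", 3299), ("手机", 4399),
    ("电视", 2998), ("电视", 2999), ("电视", 2998),
]


def count_by_type(rows, gte, lte):
    """Keep rows whose price is in [gte, lte], then count them per type."""
    matching = [type_ for type_, price in rows if gte <= price <= lte]
    return dict(Counter(matching))


mid_range = count_by_type(products, 2000, 6000)  # first request
low_range = count_by_type(products, 10, 2000)    # second request
print(mid_range, low_range)
```

A type with no matching document produces no bucket at all, which is why 耳机 is missing from the first result.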

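Choosing the scope of each aggregation (whether the query applies to it or not)

Lifting the query or filter restriction for some of the aggregations

A global aggregation ignores the query above it and runs over every document in the index

This helps with multi-dimensional statistics where some figures must respect the filter and others must not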

		GET product/_search
		{
		  "size": 0,
		  "query": {"range": {
		    "price": {
		      "gte": 1000
		      }
		  }}, 
		  "aggs": {
		    "max": {
		      "max": {
		        "field": "price"
		      }
		    },
		    "min": {
		      "min": {
		        "field": "price"
		      }
		    },
		    "avg": {
		      "global": {}, 
		      "aggs": {
		        "price_avg": {
		          "avg": {
		            "field": "price"
		          }
		        }
		      }
		    }
		  }
		}
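
To show the effect of global, the sketch below reproduces the request above in plain Python: max and min see only the documents matched by the query (price >= 1000), while the aggregation named avg sees all 11 documents. This is an emulation for illustration, not an ES call.

```python
# Prices of the 11 sample documents, in id order (1..11).
prices = [3999, 4999, 2999, 999, 399, 5999, 3299, 4399, 2998, 2999, 2998]

# Documents selected by the top-level range query.
matched = [p for p in prices if p >= 1000]

scoped = {
    "max": max(matched),  # restricted by the query
    "min": min(matched),  # restricted by the query
    "avg": {              # global: the query is ignored
        "doc_count": len(prices),
        "price_avg": sum(prices) / len(prices),
    },
}
print(scoped)
```

The minimum is 2998 rather than 399 because the two cheapest products are excluded by the query, yet they still pull the global average down.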

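Different metric aggregations in one request: some computed over filtered data, others over the full data set. A filter aggregation restricts only its own sub-aggregations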

		GET product/_search
		{
		  "size": 0,
		  "aggs": {
		    "max": {
		      "max": {
		        "field": "price"
		      }
		    },
		    "min": {
		      "min": {
		        "field": "price"
		      }
		    },
		    "avg": {
		      "filter": {
		        "range": {
		          "price": {
		            "gte": 1000
		          }
		        }
		      },
		      "aggs": {
		        "price_avg": {
		          "avg": {
		            "field": "price"
		          }
		        }
		      }
		    }
		  }
		}
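
This request is the mirror image of a global aggregation: there is no top-level query, so max and min cover everything, and only the aggregation wrapped in filter is narrowed down. A plain-Python emulation with the sample prices (again an illustration, not an ES call):

```python
# Prices of the 11 sample documents, in id order (1..11).
prices = [3999, 4999, 2999, 999, 399, 5999, 3299, 4399, 2998, 2999, 2998]

# Documents kept by the filter aggregation (price >= 1000).
kept = [p for p in prices if p >= 1000]

filtered = {
    "max": max(prices),  # all documents
    "min": min(prices),  # all documents
    "avg": {             # only documents that pass the filter
        "doc_count": len(kept),
        "price_avg": sum(kept) / len(kept),
    },
}
print(filtered)
```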

Sorting based on aggregations

Sort by count

		GET product/_search
		{
		  "size": 0,
		  "aggs": {
		    "tag_bucket":{
		      "terms": {
		        "field": "tags.keyword",
		        "size": 10,
		        "order": {
		          "_count": "asc"
		        }
		      }
		    }
		     
		  }
		}
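
The effect of ordering by ascending _count can be sketched in plain Python as follows. The tie-breaking by key used here is an assumption made so that the output is deterministic; it is not taken from the original request. Keep in mind that on a multi-shard index the ES documentation discourages ascending count order, because it increases the error on the reported document counts.

```python
from collections import Counter

# Tags of the 11 sample documents, in id order (1..11).
doc_tags = [
    ["性价比", "发烧", "不卡顿"],
    ["性价比", "发烧", "公交卡"],
    ["性价比", "快充", "门禁卡"],
    ["降噪", "防水", "蓝牙"],
    ["防火", "低音炮", "听声辨位"],
    ["120HZ刷新率", "120W快充", "120倍变焦"],
    ["割韭菜", "割韭菜", "割新韭菜"],
    ["5V1A", "4G全网通", "大"],
    ["巨馍", "家庭影院", "游戏"],
    ["大片", "蓝光8K", "超薄"],
    ["大片", "蓝光8K", "超薄"],
]

# Number of documents per tag (a document counts once per tag).
counts = Counter(tag for tags in doc_tags for tag in set(tags))

# Ascending doc count, ties broken by key (assumption), limited to 10 buckets.
ascending = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))[:10]
print(ascending)
```

With this data 21 tags occur in exactly one document, so all 10 returned buckets have a count of 1 and the frequent tags never show up.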

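Multi-level sorting

Both levels sort their buckets by key with _key, which replaces the older _term order key (deprecated since Elasticsearch 6.0)

		GET product/_search
		{
		  "size": 0,
		  "aggs": {
		    "type_first_order": {
		      "terms": {
		        "field": "type.keyword",
		        "order": {
		          "_key": "asc"
		        }
		      },
		      "aggs": {
		        "lv_second_order": {
		          "terms": {
		            "field": "lv.keyword",
		            "order": {
		              "_key": "asc"
		            }
		          }
		        }
		      }
		    }
		  }
		}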
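
A plain-Python sketch of the two-level key ordering is shown below. It sorts strings by Unicode code point, which is assumed here to match the byte-wise order ES applies to keyword values; the helper name sorted_buckets is hypothetical.

```python
# (type, lv) of the 11 sample documents, in id order.
products = [
    ("手机", "旗舰机"), ("手机", "旗舰机"), ("手机", "高端机"), ("耳机", "百元机"),
    ("耳机", "百元机"), ("手机", "旗舰机"), ("手机", "旗舰机"), ("手机", "旗舰机"),
    ("电视", "高端机"), ("电视", "高端机"), ("电视", "高端机"),
]


def sorted_buckets(rows):
    """Bucket by type, then by lv, with both levels sorted by key ascending."""
    tree = {}
    for type_, lv in rows:
        lv_counts = tree.setdefault(type_, {})
        lv_counts[lv] = lv_counts.get(lv, 0) + 1
    return [(type_, sorted(tree[type_].items())) for type_ in sorted(tree)]


ordered = sorted_buckets(products)
for type_, lv_buckets in ordered:
    print(type_, lv_buckets)
```

The order is purely lexicographic on the raw key, so it does not reflect any business meaning such as price tier.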

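Multi-level aggregation (ordering buckets by a metric inside a nested sub-aggregation, referenced by the path aggs_price>stats.sum)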

		GET product/_search
		{
		  "size": 0,
		  "aggs": {
		    "type_stats_price": {
		      "terms": {
		        "field": "type.keyword",
		        "order": {
		          "aggs_price>stats.sum": "asc"
		        }
		      },
		      "aggs": {
		        "aggs_price": {
		          "filter": {
		            "terms": {
		              "type.keyword": ["耳机","手机","电视"]
		            }
		          },
		          "aggs": {
		            "stats": {
		              "stats": {
		                "field": "price"
		              }
		            }
		          }
		        }
		      }
		    }
		  }
		}
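
The order path aggs_price>stats.sum means: go into the sub-aggregation aggs_price, then into stats, and sort the type buckets by its sum. The same ordering can be sketched in plain Python with the sample data (an emulation, not an ES call):

```python
from collections import defaultdict

# (type, price) of the 11 sample documents, in id order.
products = [
    ("手机", 3999), ("手机", 4999), ("手机", 2999), ("耳机", 999),
    ("耳机", 399), ("手机", 5999), ("手机", 3299), ("手机", 4399),
    ("电视", 2998), ("电视", 2999), ("电视", 2998),
]

# Sum of prices per type, i.e. the "sum" field of the nested stats.
price_sum = defaultdict(int)
for type_, price in products:
    price_sum[type_] += price

# Buckets ordered by that sum, ascending.
by_sum = sorted(price_sum.items(), key=lambda kv: kv[1])
print(by_sum)
```

In the request the inner filter lists all three types, so it does not exclude anything for this data set and the sums equal the plain per-type totals.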
