Elasticsearch Query DSL Syntax, Filter DSL, and Aggregation Analysis

少一点Bug · Published 2022-04-10 22:12:18

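1 Match all (match_all query)
GET /lagou-company-index/_search
{
  "query": {
    "match_all": {}
  }
}
query: the query object
match_all: matches every document
Response fields
took: time the query took, in milliseconds
timed_out: whether the query timed out
_shards: shard information
hits: summary object for the search results
  total: total number of matching documents
  max_score: highest score among all matching documents
  hits: array of matching documents, one element per document
    _index: index name
    _type: document type
    _id: document id
    _score: relevance score of the document
    _source: original source data of the document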
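To make the response layout concrete, here is a small Python sketch that walks a response shaped like the fields described above. The sample dictionary is made up for illustration (it is not real output from lagou-company-index), and summarize_hits is a hypothetical helper. It assumes hits.total may be either a plain number, as listed above, or the object form with a value key that Elasticsearch 7.x and later return.

```python
def summarize_hits(response):
    """Return (total, max_score, rows) where rows is a list of (_id, _score, _source)."""
    outer = response["hits"]
    total = outer["total"]
    # Elasticsearch 7.x and later report the count as {"value": N, "relation": ...}.
    if isinstance(total, dict):
        total = total["value"]
    rows = [(hit["_id"], hit["_score"], hit["_source"]) for hit in outer["hits"]]
    return total, outer["max_score"], rows


# Made-up sample data, shaped like the fields described above.
sample_response = {
    "took": 3,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": 2,
        "max_score": 1.0,
        "hits": [
            {"_index": "lagou-company-index", "_type": "_doc", "_id": "1",
             "_score": 1.0, "_source": {"name": "company-a"}},
            {"_index": "lagou-company-index", "_type": "_doc", "_id": "2",
             "_score": 1.0, "_source": {"name": "company-b"}},
        ],
    },
}

total, max_score, rows = summarize_hits(sample_response)
print(total, max_score, [doc_id for doc_id, _score, _source in rows])
```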
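2 Full-text search (full-text query)
Full-text queries search analyzed text fields such as email bodies or product descriptions. The query string is processed with the same analyzer that was applied to the field at index time.
2.1 Match search (match query)
This is the standard query for full-text search; it can run fuzzy and phrase-style matching against a single field. A match query accepts text, numbers, or dates, analyzes the input, and builds a boolean query from the resulting terms. The operator parameter sets how the terms are combined (or, and; the default is or).
2.1.1 or relation
A match query analyzes the query text into terms and then searches; the terms are combined with or.
GET /lagou-property/_search
{
  "query": {
    "match": {
      "title": "小米电视4A"
    }
  }
}
2.1.2 and relation
GET /lagou-property/_search
{
  "query": {
    "match": {
      "title": {
        "query": "小米电视4A",
        "operator": "and"
      }
    }
  }
}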
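The two request bodies above differ only in whether operator is set. As a rough sketch, a client could build either form with one helper; match_query is a hypothetical function name, and the code only constructs the JSON body without sending it anywhere.

```python
import json


def match_query(field, text, operator="or"):
    """Build a match query body; operator is "or" (the default) or "and"."""
    if operator not in ("or", "and"):
        raise ValueError("operator must be 'or' or 'and'")
    if operator == "or":
        # Short form: the terms produced by analysis are OR-ed together.
        return {"query": {"match": {field: text}}}
    # Long form: every analyzed term must be present.
    return {"query": {"match": {field: {"query": text, "operator": "and"}}}}


print(json.dumps(match_query("title", "小米电视4A", operator="and"),
                 ensure_ascii=False, indent=2))
```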
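2.2 Phrase search (match_phrase query)
The match_phrase query runs a phrase search against a single field. You can specify the analyzer and the slop (how many positions the terms may be moved apart).
GET /lagou-property/_search
{
  "query": {
    "match_phrase": {
      "title": "小米电视"
    }
  }
}
// slop 1 allows one position between 小米 and 4A, so 小米电视4A matches, with 电视 counted as the one skipped position
GET /lagou-property/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "小米 4A",
        "slop": 1
      }
    }
  }
}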
2.3 query_string query
The query_string query is an advanced query that matches against the whole document without naming a field; it can also be restricted to specific fields.
// all fields
GET /lagou-property/_search
{
  "query": {
    "query_string": {
      "query": "4288"
    }
  }
}
// a specific field
GET /lagou-property/_search
{
  "query": {
    "query_string": {
      "query": "4288",
      "default_field": "price"
    }
  }
}
// boolean operators
GET /lagou-property/_search
{
  "query": {
    "query_string": {
      "query": "手机 OR 小米",
      "default_field": "title"
    }
  }
}
GET /lagou-property/_search
{
  "query": {
    "query_string": {
      "query": "手机 AND 小米",
      "default_field": "title"
    }
  }
}
// fuzzy query (~1 allows an edit distance of 1)
GET /lagou-property/_search
{
  "query": {
    "query_string": {
      "query": "大米~1",
      "default_field": "title"
    }
  }
}
// multiple fields
GET /lagou-property/_search
{
  "query": {
    "query_string": {
      "query": "5699",
      "fields": ["title", "price"]
    }
  }
}
2.4 Multi-field match search (multi_match query)
GET /lagou-property/_search
{
  "query": {
    "multi_match": {
      "query": 5699,
      "fields": ["title", "price"]
    }
  }
}
// field names may contain wildcards
GET /lagou-property/_search
{
  "query": {
    "multi_match": {
      "query": "http://image.lagou.com/12479622.jpg",
      "fields": [
        "title",
        "ima*"
      ]
    }
  }
}
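3 Term-level search (term-level queries)
Use term-level queries to find documents by exact values in structured data, such as date ranges, IP addresses, prices, or product IDs. Unlike full-text queries, term-level queries do not analyze the search term; the term is matched exactly against the terms stored in the field.
3.1 Term search (term query)
The term query finds documents whose field contains the given term.
GET /book/_search
{
  "query": {
    "term": {
      "name": "solr"
    }
  }
}
3.2 Term set search (terms query)
The terms query finds documents whose field contains any of the given terms.
GET /book/_search
{
  "query": {
    "terms": {"name": ["solr", "elasticsearch"]}
  }
}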
3.3 Range search (range query)
gte: greater than or equal to
gt: greater than
lte: less than or equal to
lt: less than
boost: query weight; the higher the weight, the higher the score and the earlier the document is shown
GET /book/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 10,
        "lte": 200,
        "boost": 3
      }
    }
  }
}
GET /book/_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "now-2d/d",
        "lte": "now/d"
      }
    }
  }
}
GET /book/_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "18/08/2020",
        "lte": "2021",
        "format": "dd/MM/yyyy||yyyy"
      }
    }
  }
}
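The date math in the second request ("now-2d/d" and "now/d") can be hard to read at first. The Python sketch below works out the concrete bounds it stands for, assuming UTC (the default when no time_zone is given) and the documented rounding behavior: /d rounds gte down to the start of the day and lte up to the last millisecond of the day. day_window is a hypothetical helper and the timestamp is an arbitrary example.

```python
from datetime import datetime, timedelta, timezone


def day_window(now, days_back):
    """Concrete bounds for {"gte": "now-<days_back>d/d", "lte": "now/d"} in UTC."""
    start_of_today = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0)
    # gte with /d rounds down to the first millisecond of the day.
    lower = start_of_today - timedelta(days=days_back)
    # lte with /d rounds up to the last millisecond of the day.
    upper = start_of_today + timedelta(days=1) - timedelta(milliseconds=1)
    return lower, upper


now = datetime(2022, 4, 10, 22, 12, 18, tzinfo=timezone.utc)  # arbitrary example time
lower, upper = day_window(now, 2)
print(lower.isoformat(), upper.isoformat())
```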
3.4 Non-null search (exists query)
Finds documents in which the given field has a non-null value. Equivalent to column is not null in SQL.
GET /book/_search
{
  "query": {
    "exists": {
      "field": "price"
    }
  }
}
3.5 Term prefix search (prefix query)
GET /book/_search
{
  "query": {
    "prefix": {
      "name": "so"
    }
  }
}
3.6 Wildcard search (wildcard query)
GET /book/_search
{
  "query": {
    "wildcard": {"name": "so*r"}
  }
}
// match by prefix, with a boost
GET /book/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "lu*",
        "boost": 2
      }
    }
  }
}
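3.7 Regular expression search (regexp query)
The regexp query runs a term query using a regular expression. Used carelessly, regexp puts heavy load on the server. For example, a pattern that starts with .* is tested against every term in the inverted index, which is close to a full table scan and very slow. Where possible, put a literal prefix in front of the pattern.
GET /book/_search
{
  "query": {
    "regexp": {"name": "s.*"}
  }
}
// with a boost
GET /book/_search
{
  "query": {
    "regexp": {
      "name": {
        "value": "s.*",
        "boost": 3
      }
    }
  }
}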
3.8 Fuzzy search (fuzzy query)
GET /book/_search
{
  "query": {
    "fuzzy": {
      "name": "so"
    }
  }
}
// fuzziness: the maximum edit distance allowed
GET /book/_search
{
  "query": {
    "fuzzy": {
      "name": {
        "value": "so",
        "boost": 2,
        "fuzziness": 2
      }
    }
  }
}
3.9 ids search (query by a set of ids)
GET /book/_search
{
  "query": {
    "ids": {
      "type": "_doc",
      "values": ["1", "3"]
    }
  }
}
4 Compound search (compound query)
4.1 Constant score (constant_score query)
The constant_score query wraps another query and gives every matching document the same constant score.
GET /book/_search
{
  "query": {
    "term": {"description": "solr"}
  }
}
GET /book/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "description": "solr"
        }
      },
      "boost": 1.2
    }
  }
}
4.2 Boolean search (bool query)
The bool query combines several query clauses into one query using boolean logic. Available keywords:
must: the clause must match
filter: the clause must match, but it runs in filter context and does not contribute to or affect the score
should: or
must_not: the clause must not match; it runs in filter context and does not contribute to or affect the score
minimum_should_match sets the minimum number of should clauses that must match. With minimum_should_match=1, at least one should clause has to be satisfied.
GET /book/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "description": "java"
        }
      },
      "filter": {
        "term": {
          "name": "solr"
        }
      },
      "must_not": {
        "range": {
          "price": {
            "gte": 200,
            "lte": 300
          }
        }
      },
      "minimum_should_match": 1,
      "boost": 1
    }
  }
}

GET /book/_search
{
  "query": {
    "bool": {
      "should": {
        "match": {
          "description": "java"
        }
      },
      "filter": {
        "term": {
          "name": "solr"
        }
      },
      "must_not": {
        "range": {
          "price": {
            "gte": 200,
            "lte": 300
          }
        }
      },
      "minimum_should_match": 1,
      "boost": 1
    }
  }
}
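The matching logic of the should/filter/must_not request above can be sketched in plain Python. This is only a model of which documents match, not of scoring, and it is not how Elasticsearch is implemented. bool_matches, the predicates, and the three sample documents are all hypothetical; the match clause is simplified to a lowercase word lookup. The case of minimum_should_match with no should clauses is deliberately left out of the model.

```python
def bool_matches(doc, must=(), filter_=(), should=(), must_not=(),
                 minimum_should_match=None):
    """Return True if doc satisfies the clauses; each clause is a predicate on doc."""
    if minimum_should_match and not should:
        raise ValueError("minimum_should_match without should is not modelled here")
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    if minimum_should_match is None:
        # Default: 1 when there are should clauses but no must or filter, else 0.
        minimum_should_match = 1 if should and not (must or filter_) else 0
    matched = sum(1 for clause in should if clause(doc))
    return matched >= minimum_should_match


# Simplified stand-ins for the clauses in the request above.
def description_has_java(doc):
    return "java" in doc["description"].lower().split()


def name_is_solr(doc):
    return doc["name"] == "solr"


def price_200_to_300(doc):
    return 200 <= doc["price"] <= 300


# Made-up documents for illustration.
docs = [
    {"name": "solr", "description": "Java in action", "price": 100},
    {"name": "solr", "description": "Java in action", "price": 250},
    {"name": "solr", "description": "Python basics", "price": 100},
]
for doc in docs:
    print(doc["price"], doc["description"], bool_matches(
        doc, should=[description_has_java], filter_=[name_is_solr],
        must_not=[price_200_to_300], minimum_should_match=1))
```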
5 Sorting
5.1 Sorting by relevance score
// by default, results are sorted by _score in descending order
GET /book/_search
{
  "query": {
    "match": {
      "description": "solr"
    }
  }
}
GET /book/_search
{
  "query": {
    "match": {"description": "solr"}
  },
  "sort": [
    {"_score": {"order": "asc"}}
  ]
}
5.2 Sorting by field value
GET /book/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {"price": {"order": "desc"}}
  ]
}
5.3 Multi-level sorting
GET /book/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {"price": {"order": "desc"}},
    {"timestamp": {"order": "desc"}}
  ]
}
6 Pagination
size: number of documents per page
from: index of the first document on the current page, from = (pageNum - 1) * size
GET /book/_search
{
  "query": {
    "match_all": {}
  },
  "size": 2,
  "from": 0
}
GET /book/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {"price": {"order": "desc"}}
  ],
  "size": 2,
  "from": 2
}
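The from formula above is easy to get off by one, so here is a minimal Python sketch of it. page_params is a hypothetical helper. The limit check assumes the default index.max_result_window of 10000, beyond which from + size requests are rejected unless the setting is changed.

```python
MAX_RESULT_WINDOW = 10_000  # assumed default of index.max_result_window


def page_params(page_num, size, max_window=MAX_RESULT_WINDOW):
    """Translate a 1-based page number into the from/size pair of a search body."""
    if page_num < 1 or size < 1:
        raise ValueError("page_num and size must be positive")
    start = (page_num - 1) * size
    if start + size > max_window:
        raise ValueError("from + size exceeds the result window")
    return {"from": start, "size": size}


print(page_params(1, 2))
print(page_params(2, 2))
```

Pages 1 and 2 with size 2 give the from values 0 and 2 used in the two requests above.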
7 Highlighting
Add a highlight property alongside the match query:
pre_tags: opening tag
post_tags: closing tag
fields: the fields to highlight
name: declares that the name field should be highlighted; field-specific settings can follow, or the object can be left empty
GET /book/_search
{
  "query": {
    "match": {
      "name": "elasticsearch"
    }
  },
  "highlight": {
    "pre_tags": "",
    "post_tags": "",
    "fields": [
      {
        "name": {}
      },
      {
        "description": {}
      }
    ]
  }
}
// across all fields of the document
GET /book/_search
{
  "query": {
    "query_string": {
      "query": "elasticsearch"
    }
  },
  "highlight": {
    "pre_tags": "",
    "post_tags": "",
    "fields": [
      {
        "name": {}
      },
      {
        "description": {}
      }
    ]
  }
}
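As a sketch of how the highlight section might be assembled in client code, the Python helper below builds a search body with explicit tags. with_highlight is a hypothetical function; the `<em>` and `</em>` values are an assumption chosen because they are the tags Elasticsearch uses when pre_tags and post_tags are omitted, and any other markup could be passed instead.

```python
import json


def with_highlight(query, fields, pre_tag="<em>", post_tag="</em>"):
    """Return a search body that highlights the given fields with the given tags."""
    return {
        "query": query,
        "highlight": {
            "pre_tags": [pre_tag],
            "post_tags": [post_tag],
            # An empty object per field means "use the settings above".
            "fields": {field: {} for field in fields},
        },
    }


body = with_highlight({"match": {"name": "elasticsearch"}}, ["name", "description"])
print(json.dumps(body, indent=2))
```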
10 Batch document operations (bulk and mget)
10.1 mget batch retrieval
// same index
GET /_mget
{
  "docs": [
    {
      "_index": "book",
      "_id": 1
    },
    {
      "_index": "book",
      "_id": 2
    }
  ]
}
// different indexes
GET /_mget
{
  "docs": [
    {
      "_index": "book",
      "_id": 1
    },
    {
      "_index": "lagou-company-index",
      "_id": 1
    }
  ]
}
10.2 bulk batch create, update, and delete
A bulk request performs a series of create, update, and delete operations on documents in a single request, which reduces the number of network round trips.
POST /_bulk
{ "delete": { "_index": "book", "_id": "1" }}
{ "create": { "_index": "book", "_id": "5" }}
{ "name": "test14", "price": 100.99 }
{ "update": { "_index": "book", "_id": "2" }}
{ "doc": { "name": "test" }}
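The bulk body is newline-delimited JSON: one action line, followed by a source line for every action except delete, with a trailing newline at the end. A minimal Python sketch of building such a body is shown below; bulk_body is a hypothetical helper and it only produces the string, it does not send the request.

```python
import json


def bulk_body(actions):
    """Build an NDJSON bulk body from (action, metadata, payload) tuples."""
    lines = []
    for action, meta, payload in actions:
        lines.append(json.dumps({action: meta}))
        if action == "delete":
            # delete is the only action without a following source line.
            if payload is not None:
                raise ValueError("delete takes no payload")
        else:
            if payload is None:
                raise ValueError(f"{action} needs a payload line")
            lines.append(json.dumps(payload))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ("delete", {"_index": "book", "_id": "1"}, None),
    ("create", {"_index": "book", "_id": "5"}, {"name": "test14", "price": 100.99}),
    ("update", {"_index": "book", "_id": "2"}, {"doc": {"name": "test"}}),
])
print(body, end="")
```

When sending this with an HTTP client, the Content-Type header should be application/x-ndjson.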
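11 Filter DSL
Filters do not compute relevance scores, so they are cheaper to evaluate. Filters can also be cached in memory, which makes them considerably faster than the equivalent queries on repeated searches.
GET /book/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "range": {
          "price": {
            "gte": 200,
            "lte": 500
          }
        }
      }
    }
  }
}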
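12 Locating an invalid search and its cause
// match1 is not a valid query type, so the explain output reports the error
GET /book/_validate/query?explain
{
  "query": {
    "match1": {
      "name:": "test"
    }
  }
}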
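13 Aggregation analysis
13.1 Introduction to aggregations
Aggregations that compute a value such as the maximum, minimum, sum, or average over a data set are called metric aggregations in ES.
Relational databases offer aggregate functions and can also group the queried rows with group by and then compute metrics per group. In ES, group by is called bucketing (bucket aggregations).
13.2 Syntax
GET /book/_search
{
  "size": 0,
  "aggs": {
    "NAME": {
      "AGG_TYPE": {}
    }
  }
}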
13.3 Maximum price
GET /book/_search
{
  "size": 0,
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}
13.4 Average price
GET /book/_search
{
  "size": 0,
  "aggs": {
    "avg_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}
13.5 Count documents with price greater than 100
GET /book/_count
{
  "query": {
    "range": {
      "price": {
        "gt": 100
      }
    }
  }
}
13.6 value_count: number of documents that have a value for a field
GET /book/_search
{
  "size": 0,
  "aggs": {
    "price_count": {
      "value_count": {
        "field": "price"
      }
    }
  }
}
13.7 cardinality: count of distinct values
// size 0 means no documents are returned, only the aggregation results
GET /book/_search
{
  "size": 0,
  "aggs": {
    "_id_count": {
      "cardinality": {
        "field": "price"
      }
    },
    "price_count": {
      "cardinality": {
        "field": "price"
      }
    }
  }
}
13.8 stats: returns five values (count, max, min, avg, sum)
GET /book/_search
{
  "size": 0,
  "aggs": {
    "price_stats": {
      "stats": {
        "field": "price"
      }
    }
  }
}
13.9 extended_stats: adds four results to stats (sum of squares, variance, standard deviation, and the interval of the mean plus/minus two standard deviations)
GET /book/_search
{
  "size": 0,
  "aggs": {
    "price_stats": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}
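To show what the extra extended_stats values mean, the Python sketch below computes them locally over a small made-up list of prices. It assumes the variance is the population variance (dividing by count) and that the bounds use two standard deviations, which is the aggregation's default sigma. extended_stats here is a hypothetical local function, not a client call.

```python
import math


def extended_stats(values):
    """Compute stats plus sum_of_squares, variance, std_deviation and bounds."""
    count = len(values)
    total = sum(values)
    avg = total / count
    # Population variance: mean of squared deviations from the average.
    variance = sum((v - avg) ** 2 for v in values) / count
    std = math.sqrt(variance)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum(v * v for v in values),
        "variance": variance,
        "std_deviation": std,
        "std_deviation_bounds": {"upper": avg + 2 * std, "lower": avg - 2 * std},
    }


prices = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration
print(extended_stats(prices))
```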
13.10 percentiles: the values found at given percentiles
GET /book/_search
{
  "size": 0,
  "aggs": {
    "price_percents": {
      "percentiles": {
        "field": "price"
      }
    }
  }
}

13.11 Specifying the percentiles

GET /book/_search
{
  "size": 0,
  "aggs": {
    "price_percents": {
      "percentiles": {
        "field": "price",
        "percents": [75, 99, 99.9]
      }
    }
  }
}
Reading the result: "75.0": 810.2200012207031 means that 75% of the documents have a price less than or equal to 810.2200012207031.
13.12 percentile_ranks: the share of documents whose value is less than or equal to a given value
GET /book/_search
{
  "size": 0,
  "aggs": {
    "gge_perc_rank": {
      "percentile_ranks": {
        "field": "price",
        "values": [100, 200]
      }
    }
  }
}
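A percentile rank is simply the percentage of values at or below a threshold. The Python sketch below computes it exactly over a made-up price list; note that Elasticsearch returns an approximation for this aggregation on large data sets, so its numbers can differ slightly. percentile_rank is a hypothetical local function.

```python
def percentile_rank(values, threshold):
    """Percentage of values that are less than or equal to threshold."""
    if not values:
        raise ValueError("values must not be empty")
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)


prices = [50, 100, 150, 200, 810.22]  # made-up values for illustration
for value in (100, 200):
    print(value, percentile_rank(prices, value))
```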
14 Bucket aggregations
bucket: a group of documents
metric: a statistic computed over one group
GET /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {
            "from": 0,
            "to": 200
          },
          {
            "from": 200,
            "to": 400
          },
          {
            "from": 400,
            "to": 1000
          }
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
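What the request above computes can be sketched locally in Python: each price goes into the range whose from is less than or equal to it and whose to is greater than it (from is inclusive, to is exclusive), and the average is then taken per bucket. range_buckets and the price list are hypothetical and only illustrate the grouping; the output layout is not the Elasticsearch response format.

```python
def range_buckets(prices, ranges):
    """Group prices into [low, high) ranges and compute a per-bucket average."""
    buckets = []
    for low, high in ranges:
        members = [p for p in prices if low <= p < high]
        average = sum(members) / len(members) if members else None
        buckets.append({
            "from": low,
            "to": high,
            "doc_count": len(members),
            "average_price": average,
        })
    return buckets


prices = [100, 200, 250, 400, 999, 1000]  # made-up values for illustration
for bucket in range_buckets(prices, [(0, 200), (200, 400), (400, 1000)]):
    print(bucket)
```

The price 1000 lands in no bucket because the upper bound of the last range is exclusive.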

GET /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {
            "from": 0,
            "to": 200
          },
          {
            "from": 200,
            "to": 400
          },
          {
            "from": 400,
            "to": 1000
          }
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        },
        "count_price": {
          "value_count": {
            "field": "price"
          }
        },
        "having": {
          "bucket_selector": {
            "buckets_path": {
              "avg_price": "average_price"
            },
            "script": {
              "source": "params.avg_price >= 200"
            }
          }
        }
      }