The previous post covered basic CRUD. This one looks at how to fetch and write documents in bulk.

1. Bulk retrieval

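Elasticsearch offers two APIs for bulk reads: mget and msearch. mget fetches documents by ID, so the IDs must be known in advance; each lookup can name its own index and choose which _source fields to return. msearch runs several searches in one request, each with its own query on document fields.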

1.1 Using mget

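Bulk retrieval here means more than fetching several documents at once: a single request can carry several independent lookups. The request below fetches three documents by ID and asks for a different set of _source fields for each one. Each entry could also name a different index, which the single-document GET covered earlier cannot do.

GET /_mget
{
  "docs":[
    {
      "_index":"user",
      "_id":"p2LBloIBJl3IJPYz41Fo",
      "_source":"age"
    },
    {
      "_index":"user",
      "_id":"ZjdoloIBJl3IJPYzxoOT",
      "_source":["age","name"]
    },
    {
      "_index":"user",
      "_id":"NpOGloIBJl3IJPYzbCCX",
      "_source":["age","height"]
    }
  ]
}

If every lookup targets the same index, the index can move into the URL (GET /user/_mget) and be dropped from each entry. If the lookups also need no per-document options, the body shrinks to a plain list of IDs:

GET /user/_mget
{
  "ids":["p2LBloIBJl3IJPYz41Fo","ZjdoloIBJl3IJPYzxoOT","NpOGloIBJl3IJPYzbCCX"]
}

The docs-form request, with its per-document _source filters, returns the following: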

{
  "docs" : [
    {
      "_index" : "user",
      "_type" : "type1",
      "_id" : "p2LBloIBJl3IJPYz41Fo",
      "_version" : 1,
      "_seq_no" : 24167018,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "age" : 54
      }
    },
    {
      "_index" : "user",
      "_type" : "type1",
      "_id" : "ZjdoloIBJl3IJPYzxoOT",
      "_version" : 1,
      "_seq_no" : 17640047,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "张三",
        "age" : 72
      }
    },
    {
      "_index" : "user",
      "_type" : "type1",
      "_id" : "NpOGloIBJl3IJPYzbCCX",
      "_version" : 1,
      "_seq_no" : 19641070,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "age" : 18,
        "height" : 189.6
      }
    }
  ]
}
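
To show how the request and response shapes above fit together in client code, here is a minimal Python sketch using only the standard library. The helper names build_mget_body and sources_by_id are made up for this example and are not part of any Elasticsearch client. The sample response is a trimmed copy of the one above plus one hypothetical missing ID, which mget reports as "found": false.

```python
import json


def build_mget_body(lookups, default_index=None):
    """Build an mget request body (hypothetical helper).

    lookups: list of (doc_id, fields) pairs; fields is None for the full _source.
    default_index: written into every entry when given; leave it as None when
    the index is already part of the URL (GET /user/_mget).
    """
    docs = []
    for doc_id, fields in lookups:
        entry = {"_id": doc_id}
        if default_index is not None:
            entry["_index"] = default_index
        if fields is not None:
            entry["_source"] = fields
        docs.append(entry)
    return {"docs": docs}


def sources_by_id(mget_response):
    """Map each found document's _id to its _source, skipping misses."""
    return {
        doc["_id"]: doc["_source"]
        for doc in mget_response["docs"]
        if doc.get("found")
    }


body = build_mget_body(
    [
        ("p2LBloIBJl3IJPYz41Fo", ["age"]),
        ("ZjdoloIBJl3IJPYzxoOT", ["age", "name"]),
    ],
    default_index="user",
)
print(json.dumps(body, ensure_ascii=False, indent=2))

# Trimmed copy of the response above; the last entry is a hypothetical miss.
sample_response = {
    "docs": [
        {"_index": "user", "_id": "p2LBloIBJl3IJPYz41Fo", "found": True,
         "_source": {"age": 54}},
        {"_index": "user", "_id": "ZjdoloIBJl3IJPYzxoOT", "found": True,
         "_source": {"name": "张三", "age": 72}},
        {"_index": "user", "_id": "no-such-id", "found": False},
    ]
}
found = sources_by_id(sample_response)
print(found)
```

Keying the result by _id rather than by position makes it easy to see which requested IDs did not come back.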

1.2 Using msearch

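An msearch request goes to GET /_msearch. Each search takes two lines: a header object that selects the index (and, on versions before 8.0, the type), followed by the search body, which uses the same syntax as a regular search. When the index is given in the URL, an empty header {} falls back to that index, and any other header can still name a different one. The // comments in the example are annotations for the reader; remove them if your client rejects comments in the request body.

GET /user/_msearch
{} // header of search 1: empty, so the index from the URL (user) is used
{"query":{"match":{"name":"张三"}},"from":0,"size":1} // body of search 1
{"index":"dbl_test_001"} // header of search 2: overrides the index from the URL
{"query":{"match_all":{}},"from":0,"size":1} // body of search 2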

The response has the following format, with one entry in responses per search, in request order:

{
  "took" : 11,
  "responses" : [
    {
      "took" : 11,
      "timed_out" : false,
      "_shards" : {
        "total" : 3,
        "successful" : 3,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 10000,
          "relation" : "gte"
        },
        "max_score" : 16.08599,
        "hits" : [
          {
            "_index" : "user",
            "_type" : "type1",
            "_id" : "p2LBloIBJl3IJPYz41Fo",
            "_score" : 16.08599,
            "_source" : {
              "age" : 54,
              "birthday" : "1967-11-29",
              "height" : 159.7,
              "hobbys" : [
                "单机游戏",
                "听音乐",
                "跑步",
                "绘画"
              ],
              "name" : "张三",
              "nativePlace" : "天津市",
              "phone" : "13658233947",
              "school" : "华东师范大学",
              "sex" : 0,
              "weight" : 101.26
            }
          }
        ]
      },
      "status" : 200
    },
    {
      "took" : 6,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "dbl_test_001",
            "_type" : "type",
            "_id" : "2jxrcYIBfwERNF3v7l3i",
            "_score" : 1.0,
            "_source" : {
              "name" : "test",
              "value" : "测试个锤锤"
            }
          }
        ]
      },
      "status" : 200
    }
  ]
}
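
The header/body pairing described in this section can be sketched in Python with the standard library alone. build_msearch_body and hits_per_search are illustrative names, not functions from an Elasticsearch client, and the sample response is a heavily trimmed copy of the one above.

```python
import json


def build_msearch_body(searches):
    """Serialize (header, search_body) pairs as newline-delimited JSON.

    Every search contributes exactly two lines, and the text ends with a
    newline, as the msearch endpoint expects.
    """
    lines = []
    for header, search_body in searches:
        lines.append(json.dumps(header, ensure_ascii=False))
        lines.append(json.dumps(search_body, ensure_ascii=False))
    return "\n".join(lines) + "\n"


def hits_per_search(msearch_response):
    """Return one list of _source dicts per search, in request order.

    A search without a "hits" section (for example one that failed) maps to None.
    """
    results = []
    for item in msearch_response["responses"]:
        if "hits" not in item:
            results.append(None)
        else:
            results.append([hit["_source"] for hit in item["hits"]["hits"]])
    return results


searches = [
    # Empty header: use the index from the URL (GET /user/_msearch).
    ({}, {"query": {"match": {"name": "张三"}}, "from": 0, "size": 1}),
    # Explicit header: search a different index.
    ({"index": "dbl_test_001"}, {"query": {"match_all": {}}, "from": 0, "size": 1}),
]
ndjson = build_msearch_body(searches)
print(ndjson, end="")

# Heavily trimmed copy of the response shown above.
sample_response = {
    "responses": [
        {"hits": {"hits": [{"_index": "user", "_source": {"name": "张三", "age": 54}}]},
         "status": 200},
        {"hits": {"hits": [{"_index": "dbl_test_001", "_source": {"name": "test"}}]},
         "status": 200},
    ]
}
per_search = hits_per_search(sample_response)
print(per_search)
```

Because responses come back in request order, the position in the returned list is enough to match each result to the search that produced it.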


2. Bulk writes

2.1 The _bulk API in brief

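Bulk writes go through the _bulk API.

  • Method: POST

  • Endpoint: _bulk

  • Body: newline-delimited JSON. Each operation takes one or two lines:

    The first line names the action and its target (index, type, id).
    The second line carries the document data. delete has no second line.

The body must end with a newline. The _type field used in the examples here is deprecated in Elasticsearch 7.x and was removed in 8.0, so leave it out on newer versions.

The example below creates two documents, with IDs 1 and 2: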

POST _bulk
{"create":{"_index":"user","_type":"type1","_id":"1"}}
{"age":54,"birthday":"1967-11-29","height":159.7,"hobbys":["单机游戏","听音乐","跑步","绘画"],"name":"张三666","nativePlace":"天津市","phone":"13658233947","school":"华东师范大学","sex":0,"weight":101.26}
{"create":{"_index":"user","_type":"type1","_id":"2"}}
{"age":54,"birthday":"1967-11-29","height":159.7,"hobbys":["单机游戏","听音乐","跑步","绘画"],"name":"张三777","nativePlace":"天津市","phone":"13658233947","school":"华东师范大学","sex":0,"weight":101.26}

The response contains one result per operation, in request order. A successful create returns status 201.

{
  "took" : 6,
  "errors" : false,
  "items" : [
    {
      "create" : {
        "_index" : "user",
        "_type" : "type1",
        "_id" : "1",
        "_version" : 3,
        "result" : "created",
        "_shards" : {
          "total" : 1,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 29134592,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : "user",
        "_type" : "type1",
        "_id" : "2",
        "_version" : 3,
        "result" : "created",
        "_shards" : {
          "total" : 1,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 29137551,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}
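
Since each operation gets its own entry in items, client code usually has to walk that list instead of trusting the HTTP status of the whole request. The Python sketch below shows one way to do that. failed_items is a name invented for this example, and the second sample response is hand-written to illustrate a failed create, not captured from a real cluster.

```python
def failed_items(bulk_response):
    """List (position, action, doc_id, status, error_type) for failed items.

    The top-level "errors" flag is checked first: when it is false, no item
    failed and the items list does not need to be scanned.
    """
    if not bulk_response.get("errors"):
        return []
    failures = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item is a single-key object: {action: result}.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append(
                (position, action, result.get("_id"), result.get("status"),
                 result["error"].get("type"))
            )
    return failures


# Trimmed copy of the successful response shown above.
ok_response = {
    "took": 6,
    "errors": False,
    "items": [
        {"create": {"_index": "user", "_id": "1", "result": "created", "status": 201}},
        {"create": {"_index": "user", "_id": "2", "result": "created", "status": 201}},
    ],
}

# Hand-written illustration: the second create hits an ID that already exists.
conflict_response = {
    "took": 5,
    "errors": True,
    "items": [
        {"create": {"_index": "user", "_id": "3", "result": "created", "status": 201}},
        {"create": {"_index": "user", "_id": "1", "status": 409,
                    "error": {"type": "version_conflict_engine_exception",
                              "reason": "document already exists"}}},
    ],
}

print(failed_items(ok_response))
print(failed_items(conflict_response))
```

The position in the tuple is the index of the operation in the request, which is what a retry routine would need to resend only the failed operations.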

2.2 Action types

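The action type is the key on the first line of each operation. The four types are create, index, delete, and update, and they can be mixed within a single _bulk request.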

2.2.1 Creating documents with create

The example above already used create to add documents:

POST _bulk
{"create":{"_index":"user","_type":"type1","_id":"1"}}
{"age":54,"birthday":"1967-11-29","height":159.7,"hobbys":["单机游戏","听音乐","跑步","绘画"],"name":"张三666","nativePlace":"天津市","phone":"13658233947","school":"华东师范大学","sex":0,"weight":101.26}
{"create":{"_index":"user","_type":"type1","_id":"2"}}
{"age":54,"birthday":"1967-11-29","height":159.7,"hobbys":["单机游戏","听音乐","跑步","绘画"],"name":"张三777","nativePlace":"天津市","phone":"13658233947","school":"华东师范大学","sex":0,"weight":101.26}

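Creating a document whose ID already exists fails for that item: it comes back with status 409 (version_conflict_engine_exception), while the other items in the request are still processed.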

2.2.2 Creating or fully replacing with index

If the ID already exists, index replaces the whole document stored under that ID.

If the ID does not exist, a new document is created.

2.2.3 Deleting documents with delete

A delete operation needs no second line; the action line alone is enough.

POST _bulk
{"delete":{"_index":"user","_type":"type1","_id":"1"}}
{"delete":{"_index":"user","_type":"type1","_id":"2"}}

2.2.4 Updating documents with update

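A bulk update has the same two-line structure as a create. The second line follows the partial-update syntax covered earlier: the fields to change go under doc.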

POST _bulk
{"update":{"_index":"user","_type":"type1","_id":"5"}}
{"doc":{"age":33}}
{"update":{"_index":"user","_type":"type1","_id":"6"}}
{"doc":{"age":66}}
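
Pulling the four action types together, the sketch below assembles a mixed _bulk body in Python using only the standard library. build_bulk_body is a hypothetical helper, not part of any Elasticsearch client. The documents are shortened versions of the ones used above, and _type is left out on the assumption that the target cluster no longer needs it.

```python
import json

ACTIONS = ("create", "index", "update", "delete")


def build_bulk_body(operations):
    """Serialize (action, meta, source) tuples as newline-delimited JSON.

    source must be None for delete (it has no second line) and a dict for
    every other action. The returned text ends with a newline.
    """
    lines = []
    for action, meta, source in operations:
        if action not in ACTIONS:
            raise ValueError(f"unknown bulk action: {action}")
        if action == "delete" and source is not None:
            raise ValueError("delete takes no source line")
        if action != "delete" and source is None:
            raise ValueError(f"{action} needs a source line")
        lines.append(json.dumps({action: meta}, ensure_ascii=False))
        if source is not None:
            lines.append(json.dumps(source, ensure_ascii=False))
    return "\n".join(lines) + "\n"


operations = [
    ("create", {"_index": "user", "_id": "1"}, {"name": "张三666", "age": 54}),
    ("index", {"_index": "user", "_id": "2"}, {"name": "张三777", "age": 54}),
    ("update", {"_index": "user", "_id": "5"}, {"doc": {"age": 33}}),
    ("delete", {"_index": "user", "_id": "6"}, None),
]
bulk_body = build_bulk_body(operations)
print(bulk_body, end="")
```

Validating the action and source pairing before serializing catches a misplaced line locally, where it is easier to diagnose than a parse error reported by the server.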
