Elasticsearch Learning Series [2] - Batch Retrieval and Bulk Writes of Documents
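The previous post covered basic CRUD. This post continues with retrieving and writing documents in batches.
1. Batch retrieval
Elasticsearch offers two batch read APIs: mget and msearch. mget fetches documents by id, so the ids must be known in advance; each entry can name its own index and choose which _source fields to return. msearch runs several searches, each with its own query, in a single request.
1.1 The mget approach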
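Batch retrieval here means more than fetching several documents at once: several independent lookups are packed into one request. The full request further down fetches three ids and asks for different fields for each one. The entries could also point at different indices, which the single-document lookups from the previous post cannot do.
When every entry uses the same index, the index can move into the URL (GET /user/_mget) and be dropped from the entries. When the entries also want the same fields, the body shrinks further to a plain list of ids:
GET /user/_mget
{
"ids":["p2LBloIBJl3IJPYz41Fo","ZjdoloIBJl3IJPYzxoOT","NpOGloIBJl3IJPYzbCCX"]
}
The full form, where each entry names its own index and _source fields:
GET /_mget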
{
"docs":[
{
"_index":"user",
"_id":"p2LBloIBJl3IJPYz41Fo",
"_source":"age"
},
{
"_index":"user",
"_id":"ZjdoloIBJl3IJPYzxoOT",
"_source":["age","name"]
} ,
{
"_index":"user",
"_id":"NpOGloIBJl3IJPYzbCCX",
"_source":["age","height"]
}
]
}
The docs request above returns the following:
{
"docs" : [
{
"_index" : "user",
"_type" : "type1",
"_id" : "p2LBloIBJl3IJPYz41Fo",
"_version" : 1,
"_seq_no" : 24167018,
"_primary_term" : 1,
"found" : true,
"_source" : {
"age" : 54
}
},
{
"_index" : "user",
"_type" : "type1",
"_id" : "ZjdoloIBJl3IJPYzxoOT",
"_version" : 1,
"_seq_no" : 17640047,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "张三",
"age" : 72
}
},
{
"_index" : "user",
"_type" : "type1",
"_id" : "NpOGloIBJl3IJPYzbCCX",
"_version" : 1,
"_seq_no" : 19641070,
"_primary_term" : 1,
"found" : true,
"_source" : {
"age" : 18,
"height" : 189.6
}
}
]
}
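The two request shapes above can be sketched in Python. This is only an illustration of how the body is put together: build_mget_body and sources_by_id are hypothetical helper names, not part of any Elasticsearch client, and the sketch builds and reads plain dictionaries without sending anything over the network.

```python
import json


def build_mget_body(docs):
    """Build an mget request from (index, doc_id, source_fields) tuples.

    source_fields may be None (whole _source), a string, or a list.
    When every entry shares one index and none filters _source, the short
    {"ids": [...]} form is used. Returns (path, body).
    """
    indices = {index for index, _, _ in docs}
    if len(indices) == 1 and all(fields is None for _, _, fields in docs):
        return f"/{indices.pop()}/_mget", {"ids": [doc_id for _, doc_id, _ in docs]}
    entries = []
    for index, doc_id, fields in docs:
        entry = {"_index": index, "_id": doc_id}
        if fields is not None:
            entry["_source"] = fields
        entries.append(entry)
    return "/_mget", {"docs": entries}


def sources_by_id(response):
    """Map _id -> _source for the documents reported as found."""
    return {d["_id"]: d["_source"] for d in response["docs"] if d.get("found")}


path, body = build_mget_body([
    ("user", "p2LBloIBJl3IJPYz41Fo", "age"),
    ("user", "ZjdoloIBJl3IJPYzxoOT", ["age", "name"]),
])
print(path)
print(json.dumps(body, ensure_ascii=False))
```

Checking found before reading _source matters because an id that does not exist still produces an entry in docs, just without a _source.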
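1.2 The msearch approach
The endpoint is GET /_msearch. Each search takes two lines: a header that selects the target (the index, plus the type on versions that still have types) and a body holding the query, written exactly as for _search. When the index is given in the URL, a header can be left as an empty object to use that index, and a header that names a different index overrides it. In the request below, the first search has an empty header and therefore runs against user, while the second sets "index" to query dbl_test_001:
GET /user/_msearch
{}
{"query":{"match":{"name":"张三"}},"from":0,"size":1}
{"index":"dbl_test_001"}
{"query":{"match_all":{}},"from":0,"size":1}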
The response has the following shape, with one entry in responses per search, in request order:
{
"took" : 11,
"responses" : [
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : 16.08599,
"hits" : [
{
"_index" : "user",
"_type" : "type1",
"_id" : "p2LBloIBJl3IJPYz41Fo",
"_score" : 16.08599,
"_source" : {
"age" : 54,
"birthday" : "1967-11-29",
"height" : 159.7,
"hobbys" : [
"单机游戏",
"听音乐",
"跑步",
"绘画"
],
"name" : "张三",
"nativePlace" : "天津市",
"phone" : "13658233947",
"school" : "华东师范大学",
"sex" : 0,
"weight" : 101.26
}
}
]
},
"status" : 200
},
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "dbl_test_001",
"_type" : "type",
"_id" : "2jxrcYIBfwERNF3v7l3i",
"_score" : 1.0,
"_source" : {
"name" : "test",
"value" : "测试个锤锤"
}
}
]
},
"status" : 200
}
]
}
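The header/query pairing of msearch is easy to get wrong by hand, so here is a minimal Python sketch of it. build_msearch_body and hit_counts are hypothetical names chosen for this example; the code only produces the newline-delimited body and reads a response dictionary, and leaves the HTTP call to whatever client is in use.

```python
import json


def build_msearch_body(searches):
    """Serialize (header, query) pairs into a newline-delimited msearch body.

    Each search contributes two lines: the header (may be {}) and the query.
    Every line, including the last one, is terminated by a newline.
    """
    lines = []
    for header, query in searches:
        lines.append(json.dumps(header, ensure_ascii=False))
        lines.append(json.dumps(query, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


def hit_counts(response):
    """Return (status, number of hits returned) per search, in request order."""
    result = []
    for r in response["responses"]:
        hits = r.get("hits", {}).get("hits", [])
        result.append((r.get("status"), len(hits)))
    return result


body = build_msearch_body([
    ({}, {"query": {"match": {"name": "张三"}}, "from": 0, "size": 1}),
    ({"index": "dbl_test_001"}, {"query": {"match_all": {}}, "from": 0, "size": 1}),
])
print(body, end="")
```

Each entry in responses carries its own status, so one failed search does not hide the results of the others.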
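2. Bulk writes
2.1 The request format
Batch writes go through the _bulk API.
- Request method: POST
- Endpoint: _bulk
- Request body: newline-delimited JSON in which most operations take two lines, so the body usually has an even number of lines. The first line gives the operation type and its target (index, type, id); the second line carries the document data.
The example below creates two documents with ids 1 and 2: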
POST _bulk
{"create":{"_index":"user","_type":"type1","_id":"1"}}
{"age":54,"birthday":"1967-11-29","height":159.7,"hobbys":["单机游戏","听音乐","跑步","绘画"],"name":"张三666","nativePlace":"天津市","phone":"13658233947","school":"华东师范大学","sex":0,"weight":101.26}
{"create":{"_index":"user","_type":"type1","_id":"2"}}
{"age":54,"birthday":"1967-11-29","height":159.7,"hobbys":["单机游戏","听音乐","跑步","绘画"],"name":"张三777","nativePlace":"天津市","phone":"13658233947","school":"华东师范大学","sex":0,"weight":101.26}
The response contains one result per operation, in request order. A successful create reports status 201:
{
"took" : 6,
"errors" : false,
"items" : [
{
"create" : {
"_index" : "user",
"_type" : "type1",
"_id" : "1",
"_version" : 3,
"result" : "created",
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 29134592,
"_primary_term" : 1,
"status" : 201
}
},
{
"create" : {
"_index" : "user",
"_type" : "type1",
"_id" : "2",
"_version" : 3,
"result" : "created",
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 29137551,
"_primary_term" : 1,
"status" : 201
}
}
]
}
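Because every operation gets its own entry in items, a caller usually wants to scan them rather than trust the HTTP status of the whole request. The following Python sketch shows one way to do that. summarize_bulk and failed_items are hypothetical helper names, and the sample dictionary is a trimmed copy of the response above.

```python
def summarize_bulk(response):
    """Return (has_errors, rows); rows are (operation, _id, status) per item."""
    rows = []
    for item in response["items"]:
        # Each item is a single-key object such as {"create": {...}}.
        op, result = next(iter(item.items()))
        rows.append((op, result["_id"], result["status"]))
    return response["errors"], rows


def failed_items(response):
    """Keep only the rows whose status is outside the 2xx range."""
    _, rows = summarize_bulk(response)
    return [row for row in rows if not 200 <= row[2] < 300]


sample = {
    "took": 6,
    "errors": False,
    "items": [
        {"create": {"_index": "user", "_id": "1", "result": "created", "status": 201}},
        {"create": {"_index": "user", "_id": "2", "result": "created", "status": 201}},
    ],
}
print(summarize_bulk(sample))
print(failed_items(sample))
```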
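2.2 Operation types
The operation type is the key of the first line. The main ones are create, index, delete and update, and they can be mixed within one _bulk request.
2.2.1 Bulk create with create
The example above already used create to add documents: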
POST _bulk
{"create":{"_index":"user","_type":"type1","_id":"1"}}
{"age":54,"birthday":"1967-11-29","height":159.7,"hobbys":["单机游戏","听音乐","跑步","绘画"],"name":"张三666","nativePlace":"天津市","phone":"13658233947","school":"华东师范大学","sex":0,"weight":101.26}
{"create":{"_index":"user","_type":"type1","_id":"2"}}
{"age":54,"birthday":"1967-11-29","height":159.7,"hobbys":["单机游戏","听音乐","跑步","绘画"],"name":"张三777","nativePlace":"天津市","phone":"13658233947","school":"华东师范大学","sex":0,"weight":101.26}
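Creating an id that already exists fails for that item only: it reports a version conflict with status 409 and the top-level errors flag becomes true, while the other operations in the request still run.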
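2.2.2 Create or full replace with index
If the id already exists, index replaces the whole existing document.
If the id does not exist, a new document is added.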
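The difference between create and index is easiest to see in a toy model. The Python sketch below uses a plain dictionary as a stand-in for an index; it is not Elasticsearch code. apply_write is a hypothetical function, and the second element of its return value is just a label used by this sketch (a real failed item carries an error object instead of a result).

```python
def apply_write(store, op, doc_id, source):
    """Apply a create or index action to a dict that stands in for an index.

    Returns (status, label), loosely mirroring what a bulk item reports.
    """
    exists = doc_id in store
    if op == "create":
        if exists:
            # create never overwrites: the existing document is left untouched
            return 409, "version_conflict"
        store[doc_id] = dict(source)
        return 201, "created"
    if op == "index":
        # index replaces the whole document; old fields are not merged in
        store[doc_id] = dict(source)
        return (200, "updated") if exists else (201, "created")
    raise ValueError(f"unsupported operation: {op}")


store = {}
print(apply_write(store, "create", "1", {"name": "张三666", "age": 54}))
print(apply_write(store, "create", "1", {"name": "张三777"}))
print(apply_write(store, "index", "1", {"name": "张三777"}))
print(store)
```

After the last call the age field is gone, which is what full replacement means: fields missing from the new document are not kept.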
2.2.3 Bulk delete with delete
A delete takes only the action line; no second line follows it.
POST _bulk
{"delete":{"_index":"user","_type":"type1","_id":"1"}}
{"delete":{"_index":"user","_type":"type1","_id":"2"}}
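2.2.4 Bulk update with update
An update has the same two-line structure as a create. The second line follows the partial-update format covered earlier, with the changed fields placed under doc: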
POST _bulk
{"update":{"_index":"user","_type":"type1","_id":"5"}}
{"doc":{"age":33}}
{"update":{"_index":"user","_type":"type1","_id":"6"}}
{"doc":{"age":66}}
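Since the four operation types can share one request but differ in how many lines each needs, a small builder helps keep the body well formed. This is a minimal Python sketch under stated assumptions: bulk_lines is a hypothetical helper, it only produces the request body, and it leaves out the _type field used in the examples above because mapping types were removed in Elasticsearch 8.0.

```python
import json


def bulk_lines(actions):
    """Turn (op, index, doc_id, payload) tuples into a _bulk request body.

    create/index: payload is the full document source.
    update:       payload is the partial document, wrapped in {"doc": ...}.
    delete:       payload is ignored; only the action line is written.
    """
    lines = []
    for op, index, doc_id, payload in actions:
        if op not in ("create", "index", "update", "delete"):
            raise ValueError(f"unknown bulk operation: {op}")
        lines.append(json.dumps({op: {"_index": index, "_id": doc_id}}, ensure_ascii=False))
        if op == "update":
            lines.append(json.dumps({"doc": payload}, ensure_ascii=False))
        elif op != "delete":
            lines.append(json.dumps(payload, ensure_ascii=False))
    # Every line, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)


mixed = bulk_lines([
    ("index", "user", "1", {"name": "张三666", "age": 54}),
    ("update", "user", "5", {"age": 33}),
    ("delete", "user", "2", None),
])
print(mixed, end="")
```

The mixed request above produces five lines: two for index, two for update and one for delete.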