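Problem 1: Importing documents one at a time through the REST API works when the data volume is very small, but importing 1 million documents that way can take a very long time. The goal is to import a large amount of data in a short time.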

https://www.elastic.co/guide/cn/elasticsearch/guide/current/bulk.html#bulk
This article gives a brief introduction to bulk import operations.

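My improvements
1. First, generate the data files to import. Each file is newline-delimited JSON, and the last line must end with a newline character.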

{"index":{"_index":"test2","_type":"_doc"}}
{"name" : "java test 0", "id" : 0, "datetime" : "2021-08-03 05:35:50", "type": "pink", "make": "jiangxi"}
{"index":{"_index":"test2","_type":"_doc"}}
{"name" : "java test 1", "id" : 1, "datetime" : "2021-08-03 05:35:50", "type": "pink", "make": "wuhan"}
{"index":{"_index":"test2","_type":"_doc"}}
{"name" : "java test 2", "id" : 2, "datetime" : "2021-08-03 05:35:50", "type": "red", "make": "wuhan"}

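_index is the index and _type is the type (Elasticsearch defaults to _doc); the line after each action line is the document to insert. Keep each data file at around 10 MB. Note that mapping types are deprecated in Elasticsearch 7.x and removed in 8.x, so on newer versions the _type field should be omitted.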

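import java.io.File;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Random;

public class BulkFileGenerator {
    public static void main(String[] args) throws Exception {
        String[] type = {"red", "pink", "blue", "green"};
        String[] make = {"wuhan", "jiangxi", "hubei"};
        // HH is the 24-hour clock; the 12-hour hh cannot distinguish AM from PM
        SimpleDateFormat ft = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        Random random = new Random();
        FileOutputStream out = null;
        int count = 1;
        try {
            for (int i = 0; i < 10000000; i++) {
                // Start a new file every 100,000 documents
                if (i % 100000 == 0) {
                    if (out != null) {
                        out.close();
                    }
                    String filePath = "D:\\idea\\demo\\src\\main\\resources\\es\\test" + count + ".json";
                    out = new FileOutputStream(new File(filePath), false);
                    count++;
                }
                Date dNow = new Date();
                // One action line followed by one document line, each terminated by \n
                String str = "{\"index\":{\"_index\":\"test2\",\"_type\":\"_doc\"}}\n" +
                        "{\"name\" : \"java test " + i + "\", \"id\" : " + i + ", \"datetime\" : \"" + ft.format(dNow) + "\", \"type\": \"" + type[random.nextInt(type.length)] + "\", \"make\": \"" + make[random.nextInt(make.length)] + "\"}\n";
                out.write(str.getBytes(StandardCharsets.UTF_8));
            }
        } finally {
            // Close the last file even if writing fails
            if (out != null) {
                out.close();
            }
        }
    }
}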

The code above writes 100,000 documents to each file.
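Since the stated target is a file size of about 10 MB rather than a fixed document count, the rollover could also be driven by the number of bytes written. The following is only a minimal sketch under assumptions: the class name `SizeBasedSplitter`, the method `split`, and the `partN.json` naming pattern are hypothetical and do not come from the original code.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SizeBasedSplitter {
    // Same action line as in the sample data above
    private static final String ACTION = "{\"index\":{\"_index\":\"test2\",\"_type\":\"_doc\"}}\n";

    /**
     * Writes every document as an action line plus a source line and starts a
     * new file (part1.json, part2.json, ... -- a hypothetical naming scheme)
     * whenever the next pair would push the current file past maxBytes.
     * Returns the files that were written, in order.
     */
    public static List<Path> split(Iterable<String> docs, Path dir, long maxBytes) {
        List<Path> files = new ArrayList<>();
        OutputStream out = null;
        long written = 0;
        try {
            for (String doc : docs) {
                // An action/source pair is never split across two files
                byte[] pair = (ACTION + doc + "\n").getBytes(StandardCharsets.UTF_8);
                if (out == null || written + pair.length > maxBytes) {
                    if (out != null) {
                        out.close();
                    }
                    Path file = dir.resolve("part" + (files.size() + 1) + ".json");
                    out = new BufferedOutputStream(Files.newOutputStream(file));
                    files.add(file);
                    written = 0;
                }
                out.write(pair);
                written += pair.length;
            }
            if (out != null) {
                out.close(); // flush errors surface here on the success path
                out = null;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (out != null) {
                // Only reached after a failure; the original exception wins
                try { out.close(); } catch (IOException ignored) { }
            }
        }
        return files;
    }
}
```

Calling `split(docs, dir, 10L * 1024 * 1024)` would cap each file at roughly 10 MB. Because every pair ends with `\n`, each file also ends with the trailing newline that the bulk format requires.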

2. Run the bulk import with curl -XPUT "localhost:9200/_bulk" -H "Content-Type: application/json" --data-binary @test1.json
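The same request can be sent from Java. The sketch below assumes Java 11 or later (for `java.net.http.HttpClient`) and uses hypothetical names (`BulkUploader`, `upload`, `hasItemErrors`). It also checks the `errors` flag of the response, because the bulk API answers with HTTP 200 even when individual items fail. The flag is detected with a simple pattern match here, which is a simplification; a real client would parse the JSON.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

public class BulkUploader {
    // Simplification: looks for "errors": true anywhere in the body instead of parsing JSON
    private static final Pattern ERRORS_TRUE = Pattern.compile("\"errors\"\\s*:\\s*true");

    /** Returns true if the bulk response body reports at least one failed item. */
    public static boolean hasItemErrors(String responseBody) {
        return ERRORS_TRUE.matcher(responseBody).find();
    }

    /** Sends one bulk file to baseUrl + "/_bulk" and returns the response body. */
    public static String upload(HttpClient client, String baseUrl, Path file)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/_bulk"))
                .header("Content-Type", "application/json")
                // Streams the file as the request body, like curl --data-binary @file
                .PUT(HttpRequest.BodyPublishers.ofFile(file))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Bulk request failed with HTTP " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: BulkUploader <baseUrl> <file>");
            return;
        }
        String body = upload(HttpClient.newHttpClient(), args[0], Paths.get(args[1]));
        System.out.println(hasItemErrors(body) ? "some items failed" : "all items indexed");
    }
}
```

For example, `java BulkUploader http://localhost:9200 test1.json` would correspond to the curl command above.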

3. Use a shell script to import all the files in one run (the loop sends the files one after another):

#!/bin/bash
# Send test1.json through test100.json to the bulk API, one file at a time
int=0
while (( int < 100 ))
do
	let "int++"
	echo "test${int}.json"
	curl -XPUT "localhost:9200/_bulk" -H "Content-Type: application/json" --data-binary "@test${int}.json"
done
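
The loop above sends the files sequentially. If the cluster has spare capacity, several files could be sent concurrently instead. The following is a rough sketch of that idea, not part of the original approach: `ParallelImporter` and `importAll` are hypothetical names, and the `uploader` argument stands in for whatever code actually sends one file (for instance, a wrapper around the curl call or an HTTP client) and returns true on success.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

public class ParallelImporter {
    /**
     * Runs uploader on every file name using a fixed pool of worker threads.
     * Returns the names for which the uploader returned false or threw.
     */
    public static List<String> importAll(List<String> fileNames, int threads,
                                         Predicate<String> uploader) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per file; at most `threads` of them run at once
            List<Future<Boolean>> results = new ArrayList<>();
            for (String name : fileNames) {
                Callable<Boolean> task = () -> uploader.test(name);
                results.add(pool.submit(task));
            }
            // Collect outcomes in submission order
            List<String> failed = new ArrayList<>();
            for (int i = 0; i < results.size(); i++) {
                try {
                    if (!results.get(i).get()) {
                        failed.add(fileNames.get(i));
                    }
                } catch (ExecutionException e) {
                    failed.add(fileNames.get(i)); // the uploader threw
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    failed.add(fileNames.get(i));
                }
            }
            return failed;
        } finally {
            pool.shutdown();
        }
    }
}
```

The thread count is worth tuning by experiment: too many concurrent bulk requests can overload the cluster, which then rejects requests with HTTP 429. The returned list makes it possible to retry only the files that failed.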