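A recent project required reporting each user's top 5 hot search terms. The approach taken here is to record every search keyword together with the user information in Elasticsearch, then use an aggregation to count the keywords and return the 5 most-searched ones.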

Elasticsearch version: 7.0.0

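First, define the mapping for the index that stores the keywords and user information. The index hotwords_test must already exist for this call to succeed (it can be created with PUT http://localhost:9200/hotwords_test):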

POST http://localhost:9200/hotwords_test/_mapping

{
  "properties": {
    "search_txt": {
      "type": "keyword"
    },
    "user_name": {
      "type": "text",
      "analyzer": "keyword"
    },
    "happend_time": {
      "type": "date",
      "format": "yyyy-MM-dd HH:mm:ss"
    }
  }
}
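As an alternative to creating the index and then adding the mapping in two steps, the mapping can be sent in the body of the index-creation request itself. The following is only a sketch of that idea using the JDK's built-in HTTP client (Java 11 or later is assumed); the class name CreateIndexSketch and the helper buildCreateIndexRequest are hypothetical and not part of the original project. It builds the request without sending it, so it can be inspected without a running node.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class CreateIndexSketch {
    // The mapping from this article, wrapped in "mappings" so it can be
    // supplied when the index is created (year pattern written as yyyy).
    static final String BODY =
            "{\"mappings\":{\"properties\":{"
            + "\"search_txt\":{\"type\":\"keyword\"},"
            + "\"user_name\":{\"type\":\"text\",\"analyzer\":\"keyword\"},"
            + "\"happend_time\":{\"type\":\"date\",\"format\":\"yyyy-MM-dd HH:mm:ss\"}"
            + "}}}";

    // Builds (but does not send) PUT <baseUrl>/<index> with the mapping as JSON body.
    static HttpRequest buildCreateIndexRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(BODY))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildCreateIndexRequest("http://localhost:9200", "hotwords_test");
        System.out.println(request.method() + " " + request.uri());
        // To send it against a running node, something like:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

If the index already exists, the creation request is rejected, in which case the _mapping call shown above is the one to use.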

Test data is inserted into the index through the RestHighLevelClient. First add the Maven dependencies:

<dependencies>
    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>7.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>7.0.0</version>
    </dependency>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.48</version>
    </dependency>
</dependencies>

Code that indexes one test document:

import com.alibaba.fastjson.JSONObject;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.Aggregation;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import java.io.IOException;

public class ElasticsearchTesl {
    public static final String host = "localhost";
    public static final Integer port = 9200;
    public static final String index = "hotwords_test";

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the client even if indexing throws
        try (RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(
                new HttpHost(host, port, "http")))) {
            // One document per search event: keyword, user, and timestamp
            JSONObject data = new JSONObject();
            data.put("search_txt", "大枣");
            data.put("user_name", "test");
            // Must match the date format declared in the mapping
            data.put("happend_time", "2021-10-17 15:11:30");
            String docId = indexDoc(client, index, data);
            System.out.println(docId);
        }
    }

    // Indexes one document and returns its auto-generated id, or null on an I/O failure
    public static String indexDoc(RestHighLevelClient client, String index, JSONObject data) {
        IndexRequest request = new IndexRequest(index);
        request.source(data);
        try {
            IndexResponse response = client.index(request, RequestOptions.DEFAULT);
            return response.getId();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}
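The test document above uses a hard-coded timestamp. In a real application the value of happend_time would be generated at search time and has to match the date format declared in the mapping. The following is a minimal sketch of that step using only the JDK; the class name SearchEventSketch and the helper buildEvent are hypothetical, and the pattern yyyy-MM-dd HH:mm:ss is assumed to be the one configured for the field.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

public class SearchEventSketch {
    // Assumed to be the same pattern as the "format" of happend_time in the mapping
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Builds the document body for one search event
    static Map<String, Object> buildEvent(String keyword, String userName, LocalDateTime when) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("search_txt", keyword);
        doc.put("user_name", userName);
        doc.put("happend_time", when.format(FORMAT));
        return doc;
    }

    public static void main(String[] args) {
        // "apple" is a made-up keyword used only for illustration
        System.out.println(buildEvent("apple", "test", LocalDateTime.now()));
    }
}
```

A map built this way can be passed to IndexRequest.source(Map) in the same way as the JSONObject above, since JSONObject is itself a Map.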

After running this several times with different keywords, the index contains the test data.

The aggregation query below counts how many times a single user searched for each kind of fruit and prints the 5 most-searched terms.


// Terms aggregation: one bucket per distinct search_txt value, top 5 by document count
AggregationBuilder aggregationBuilder = AggregationBuilders
        .terms("value_count").field("search_txt").size(5);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.aggregation(aggregationBuilder);
// Only documents of this user are counted
sourceBuilder.query(QueryBuilders.termQuery("user_name", "test"));
SearchRequest searchRequest = new SearchRequest(index);
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// Each bucket holds a keyword and the number of documents containing it
Aggregations aggregations = searchResponse.getAggregations();
for (Aggregation a : aggregations) {
    Terms terms = (Terms) a;
    for (Terms.Bucket bucket : terms.getBuckets()) {
        System.out.println(bucket.getKeyAsString() + ":" + bucket.getDocCount());
    }
}

Console output:

甘蔗:4
芒果:4
榴莲:3
大枣:2
桃子:2
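
To make clear what the query computes, the same logic can be sketched in plain Java without Elasticsearch: keep only one user's events, count each keyword, and take the n keywords with the highest counts, breaking ties by ascending keyword as in the output above. This is only an in-memory illustration, not how Elasticsearch implements the aggregation; the class TopTermsSketch, the helper topTerms, and the sample events are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TopTermsSketch {
    // Returns up to n "keyword:count" lines for one user, highest count first.
    // Each event is a two-element array: e[0] = user_name, e[1] = search_txt.
    static List<String> topTerms(List<String[]> events, String user, int n) {
        Map<String, Long> counts = new TreeMap<>(); // keys kept in ascending order
        for (String[] e : events) {
            if (e[0].equals(user)) {                // plays the role of the term query
                counts.merge(e[1], 1L, Long::sum);  // one "bucket" per keyword
            }
        }
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        // Stable sort on a key-ordered list: equal counts stay in ascending key order
        entries.sort(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()));
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : entries.subList(0, Math.min(n, entries.size()))) {
            result.add(entry.getKey() + ":" + entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample events, for illustration only
        List<String[]> events = List.of(
                new String[]{"test", "pear"},
                new String[]{"test", "apple"},
                new String[]{"other", "apple"},
                new String[]{"test", "apple"},
                new String[]{"test", "banana"},
                new String[]{"test", "pear"});
        topTerms(events, "test", 2).forEach(System.out::println);
    }
}
```

With the sample events above, the user test has searched apple and pear twice each and banana once, so asking for the top 2 yields apple and pear.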
