This article covers only the integration of Spring Boot with Elasticsearch and some simple API operations. For ES fundamentals and other topics, please refer to the official Elastic website: 开源搜索:Elasticsearch、ELK Stack 和 Kibana 的开发者 | Elastic

A recommended blog for learning Elasticsearch: 狂神elasticsearch笔记(纯手敲)_哦到凯呀的博客-CSDN博客_狂神说elasticsearch笔记

Reusable utility code is available here:

SpringBoot基于ElasticSearch7.9.2和ElasticsearchRestTemplate的一些通用接口(可复用)_菜菜的小咸鱼的博客-CSDN博客 

Also note that Elasticsearch moves very quickly. At the time of writing, the latest version on the official site is already 7.13, and the API may change between versions.

This article uses Spring Boot 2.3.1.RELEASE and Elasticsearch 7.9.2 for the demonstration.

Let's get started!

First, Elasticsearch and Kibana obviously need to be installed. The basic installation steps, configuration, and analyzer (IK) plugin setup are not covered here; there are plenty of guides online.

  • 1. Add the spring-data-elasticsearch dependency to the pom
<dependency>
     <groupId>org.springframework.boot</groupId>
     <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

Note: different spring-data versions default to different Elasticsearch versions. For example, the spring-data version used here is 4.0, whose default ES version is 7.6.2, so we need to specify the ES version we actually use ourselves.

Specify the ES version in the project's pom.xml:
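
The usual approach is to override the elasticsearch.version property that Spring Boot manages; a minimal sketch, assuming the project inherits Spring Boot's dependency management:

<properties>
    <!-- pin the ES client libraries to the server version used in this article -->
    <elasticsearch.version>7.9.2</elasticsearch.version>
</properties>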

  • 2. Configure the connection parameters

1. The yml configuration (mainly so the values are easy to change). Pay attention to the indentation here: since Spring's built-in ES configuration is not used, the node names can be anything you like, and the parent node is a top-level node.

elasticsearch:
  scheme: http
  host: localhost 
  port: 9200
  connection-request-timeout: 30000
  socket-timeout: 6000000
  connect-timeout: 5000000

If you use Spring's own ES configuration instead, the format is:

spring:
  data:
    elasticsearch:
      xxxx: zzzz
      xxxx: zzzz
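
For reference, Spring Boot 2.3 can also configure the high-level REST client through the spring.elasticsearch.rest.* properties; a minimal sketch (values are illustrative):

spring:
  elasticsearch:
    rest:
      uris: http://localhost:9200
      connection-timeout: 5s
      read-timeout: 30s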

2. Create a configuration class that reads these settings and builds the client connection.

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;


@Configuration
public class ElasticSearchClientConfig {

    @Value("${elasticsearch.host}")
    private String host;

    @Value("${elasticsearch.port}")
    private Integer port;

    @Value("${elasticsearch.scheme}")
    private String scheme;

    @Value("${elasticsearch.connect-timeout}")
    private Integer connectTimeout;

    @Value("${elasticsearch.socket-timeout}")
    private Integer socketTimeout;

    @Bean
    @Qualifier("highLevelClient")
    public RestHighLevelClient restHighLevelClient() {
        // The callback receives a RequestConfig.Builder, modifies it and returns it.
        RestHighLevelClient highLevelClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost(host, port, scheme))
                        .setRequestConfigCallback(requestConfigBuilder -> {
                            return requestConfigBuilder.setConnectTimeout(connectTimeout) // connect timeout (default 1s)
                                    .setSocketTimeout(socketTimeout); // socket timeout (default 30s), raised here via the yml settings
                        })); // the max retry timeout (default 30s) could also be adjusted with setMaxRetryTimeoutMillis(60000)

        return highLevelClient;
    }
}
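
Note: with spring-boot-starter-data-elasticsearch on the classpath, Spring Boot normally auto-configures an ElasticsearchRestTemplate on top of this RestHighLevelClient bean; that template is what the later examples inject. If it is not auto-configured in your setup, a template bean can be declared explicitly in the same configuration class; a minimal sketch (requires importing org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate):

    @Bean
    public ElasticsearchRestTemplate elasticsearchRestTemplate(RestHighLevelClient highLevelClient) {
        // wrap the high-level client so ElasticsearchRestTemplate can be @Autowired elsewhere
        return new ElasticsearchRestTemplate(highLevelClient);
    }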

At this point the Spring Boot and ES wiring is complete; start the project and the client connects on startup. Because the ES version we pinned is not the one spring-data was built against by default, a version-mismatch warning is logged; it can be ignored.

  • 3. Create an entity class and configure the index mapping on it, so the class can be used directly to create the index and define the mapping of its fields

Before writing the entity class and its mapping rules, it is worth going over a few commonly used Elasticsearch annotations and enums:

1. @Document, located in the org.springframework.data.elasticsearch.annotations package.

@Persistent
@Inherited
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.TYPE })
public @interface Document {
    
    /**
	 * Name of the Elasticsearch index.
	 * <ul>
	 * <li>Lowercase only</li>
	 * <li>Cannot include \, /, *, ?, ", <, >, |, ` ` (space character), ,, #</li>
	 * <li>Cannot start with -, _, +</li>
	 * <li>Cannot be . or ..</li>
	 * <li>Cannot be longer than 255 bytes (note it is bytes, so multi-byte characters will
	 * count towards the 255 limit faster)</li>
	 * </ul>
	 */
    String indexName(); // index name, analogous to a database name in MySQL

    @Deprecated
	String type() default ""; // document type, deprecated in the current version
    
    /**
	 * Use server-side settings when creating the index.
	 * Defaults to false. If set to true, Spring will not apply the following settings to the index it
	 * creates: shards, replicas, refreshInterval and indexStoreType. The Elasticsearch defaults
	 * (server configuration) are used instead, which means some of our custom settings will not take
	 * effect.
	 */
	boolean useServerConfiguration() default false;
    
    short shards() default 1; // default number of shards

    short replicas() default 1; // default number of replicas per shard

    String refreshInterval() default "1s"; // default index refresh interval

    String indexStoreType() default "fs"; // index file storage type

    /**
	 * Configuration whether to create an index on repository bootstrapping.
	 */
	boolean createIndex() default true; // if the index does not exist when the Spring context starts, create it automatically
    
    VersionType versionType() default VersionType.EXTERNAL; // the default version management type
}

2. @Field, also in the org.springframework.data.elasticsearch.annotations package. It has quite a few attributes; only the most commonly used ones are listed here:

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@Documented
@Inherited
public @interface Field {
    /**
	 * Alias for {@link #name}.
	 *
	 * @since 3.2
	 */
	@AliasFor("name")
	String value() default "";

	/**
	 * The <em>name</em> to be used to store the field inside the document.
	 * <p>
	 * If not set, the name of the annotated property is used.
	 *
	 * @since 3.2
	 */
	@AliasFor("value")
	String name() default "";
    // The two attributes above are aliases of each other.
    // They set the name of this field in the index when it is created. If neither is set, the name of
    // the property annotated with @Field is used as the index field name. For example:
    // class User{
    //
    //     @Field(name = "name")   // or: value = "name"
    //     private String userName;
    // }
    // If name (or value) is set as above, the index field backing userName is called "name";
    // otherwise it is called "userName".

    FieldType type() default FieldType.Auto; // auto-detect the field's data type; set it explicitly as needed

    DateFormat format() default DateFormat.none; // date/time format, none by default

    String searchAnalyzer() default ""; // analyzer used at search time

    String analyzer() default ""; // analyzer used at index time

    // ... (other attributes omitted)
    
}

3. The FieldType enum referenced by @Field:

public enum FieldType {
    Auto, // determined automatically from the content
    Text, // full-text fields such as product descriptions; these fields are processed by an analyzer
    Keyword, // structured content such as email addresses, host names, status codes or postal codes; typically used for filtering, sorting and aggregations. Keyword fields can only be searched by their exact value and are not analyzed
    Long, //
    Integer, //
    Short, //
    Byte, //
    Double, //
    Float, //
    Half_Float, //
    Scaled_Float, //
    Date, //
    Date_Nanos, //
    Boolean, //
    Binary, //
    Integer_Range, //
    Float_Range, //
    Long_Range, //
    Double_Range, //
    Date_Range, //
    Ip_Range, //
    Object, //
    Nested, //
    Ip, // can index and store IPv4 and IPv6 addresses
    TokenCount, //
    Percolator, //
    Flattened, //
    Search_As_You_Type //
}

With the basic annotations covered, create the entity class and its mapping rules in preparation for creating the index:

@Data // Lombok's @Data annotation
@Document(indexName = "my_index")
public class EsSourceInfo  implements Serializable { 

    private static final long serialVersionUID = -4780769443664126870L;

    @Field(type = FieldType.Keyword) // this field is not analyzed
    private String lngId;
    
    private String remark; // without @Field, the default field settings are used when the index is created
    
    /**
     * analyzer = "ik_smart": analyzer used when indexing
     * type = FieldType.Text: text field
     * searchAnalyzer = "ik_max_word": analyzer used when searching
     */
    @Field(analyzer = "ik_smart",type = FieldType.Text,searchAnalyzer = "ik_max_word")
    private String discreption;

    /**
     * analyzer = "ik_max_word": analyzer used when indexing
     * type = FieldType.Text: text field
     * searchAnalyzer = "ik_smart": analyzer used when searching
     */
    @Field(analyzer = "ik_max_word",type = FieldType.Text,searchAnalyzer = "ik_smart")
    private String address;

    /**
     * type = FieldType.Keyword: keyword type
     * keywordsArrays is used later for aggregations, so it must not be analyzed
     */
    @Field(type = FieldType.Keyword)
    private String[] keywordsArrays;
}

About the two IK analyzer modes used above:

1. ik_max_word: splits the text at the finest granularity. For example, "中华人民共和国人民大会堂" is split into 中华人民共和国, 中华人民, 中华, 华人, 人民共和国, 人民, 共和国, 大会堂, 大会, 会堂 and so on.

2. ik_smart: splits at the coarsest granularity. For example, "中华人民共和国人民大会堂" is split into 中华人民共和国 and 人民大会堂.

The usual best practice with these two analyzers is to use ik_max_word at index time and ik_smart at search time.
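
To check what a given analyzer actually produces, the analyze API can be called through the RestHighLevelClient configured above; a minimal sketch, assuming the IK plugin is installed on the ES node:

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.indices.AnalyzeRequest;
import org.elasticsearch.client.indices.AnalyzeResponse;

// print the tokens ik_smart produces for a sample sentence
AnalyzeRequest request = AnalyzeRequest.withGlobalAnalyzer("ik_smart", "中华人民共和国人民大会堂");
AnalyzeResponse response = highLevelClient.indices().analyze(request, RequestOptions.DEFAULT);
for (AnalyzeResponse.AnalyzeToken token : response.getTokens()) {
    System.out.println(token.getTerm());
}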

  • 4. Index operations
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.IndexOperations;
import org.springframework.data.elasticsearch.core.mapping.IndexCoordinates;
import org.springframework.stereotype.Component;

import java.lang.annotation.Annotation;


@Component
public class IndexOperation {
    
    
    @Autowired
    private ElasticsearchRestTemplate restTemplate;
    
    /**
     * @Description Create an index from an entity class; here that is the EsSourceInfo entity defined above
     * @Author Innocence
     */
    public Boolean createIndexByClass(Class clazz) {
        Annotation documentAnnotation = clazz.getDeclaredAnnotation(Document.class);
        if(documentAnnotation==null){
            return false;
        }
        String indexName = ((Document) documentAnnotation).indexName();
        Boolean indexExist = isIndexExist(indexName);
        if (indexExist){
            return false;
        }
        IndexOperations indexOps = restTemplate.indexOps(clazz);
        boolean result1 = indexOps.create(); // create the index
        boolean result2 = indexOps.putMapping(indexOps.createMapping(clazz)); // apply the mapping rules to the index, very important!
        return result1 && result2;
    }

    /**
     * @Description Check by index name whether an index exists
     * @Author Innocence
     */
    public Boolean isIndexExist(String indexName) {
        IndexOperations indexOps = restTemplate.indexOps(IndexCoordinates.of(indexName));
        return indexOps.exists();
    }

    /**
     * @Description Delete an index by index name
     * @Author Innocence
     */
    public Boolean deleteIndexByName(String indexName) {
        IndexOperations indexOps = restTemplate.indexOps(IndexCoordinates.of(indexName));
        return indexOps.delete();

    }
}
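
A quick usage sketch of the helper above (the injection point is illustrative):

    @Autowired
    private IndexOperation indexOperation;

    public void initIndex() {
        // creates "my_index" with the mapping derived from the @Field annotations on EsSourceInfo,
        // unless the index already exists
        indexOperation.createIndexByClass(EsSourceInfo.class);
    }
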
  • 5. Document operations (the important part: CRUD, similar to MySQL)
import cn.hutool.core.util.StrUtil;
import com.google.common.collect.Lists;
import org.apache.commons.beanutils.BeanUtils;
import org.apache.commons.beanutils.PropertyUtils;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.Aggregation;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.ParsedStringTerms;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageImpl;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.document.Document;
import org.springframework.data.elasticsearch.core.mapping.IndexCoordinates;
import org.springframework.data.elasticsearch.core.query.*;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

@Component
public class DocumentOperation {

    @Autowired
    private ElasticsearchRestTemplate restTemplate;

    @Autowired
    private RestHighLevelClient highLevelClient;

    /**
     * Map highlighted fields back onto the entity's own properties. ES highlight results are not
     * mapped onto the entity automatically, so we do it ourselves here.
     * @author Innocence
     * @param searchHits
     * @return java.util.List<T>
     */
    private <T> List<T> mappingHighlight(List<SearchHit<T>> searchHits){
        List<T> infoList = Lists.newArrayList();
        for (SearchHit<T> searchHit : searchHits) {
            T content = searchHit.getContent();
            Map<String, List<String>> highlightFields = searchHit.getHighlightFields();
            for (Map.Entry<String, List<String>> entry : highlightFields.entrySet()) {
                try {
                        PropertyUtils.setProperty(content,entry.getKey(),entry.getValue().get(0));
                } catch (IllegalAccessException | InvocationTargetException | NoSuchMethodException e) {
                    e.printStackTrace();
                }
            }
            infoList.add(content);
        }
        return infoList;
    }
    
    /**
     * Build the highlight configuration
     * @author Innocence
     * @date 2021/6/9
     * @param fields the fields to highlight
     * @return org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder
     */
    public HighlightBuilder getHighlightBuilder(String[] fields) {
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        for (String field : fields) {
            highlightBuilder.field(field);
        }
        highlightBuilder.requireFieldMatch(false);     // must be false when several fields are highlighted
        highlightBuilder.preTags("<span style=\"color:red\">");
        highlightBuilder.postTags("</span>");
        // The next two settings are required when highlighting long text fields (e.g. full article bodies);
        // otherwise the highlight may be incomplete and parts of the content lost.
        highlightBuilder.fragmentSize(800000); // maximum size of a highlight fragment
        highlightBuilder.numOfFragments(0); // 0 = return the whole field content instead of fragments

        return highlightBuilder;
    }

    /**
     * Check by id whether a document exists in the given index
     * @author Innocence
     * @param id  the document id; this id must be set by us when the data is stored
     * @param indexName    index name
     * @return java.lang.Boolean
     */
    public Boolean isExist(String id,String indexName) {
        return restTemplate.exists(id, IndexCoordinates.of(indexName));
    }
    
    /**
     *  Insert a single document
     * @author Innocence
     * @param entity the entity to insert
     * @param indexName index name
     * @return java.lang.String the document id
     */
     public String saveByEntity(EsSourceInfo entity, String indexName) throws Exception{
        if (StrUtil.isBlank(entity.getLngId())){
            throw new Exception("The document id must not be blank!");
        }
        IndexQuery build = new IndexQueryBuilder()
                .withId(entity.getLngId()) // this is where the document id is specified
                .withObject(entity).build();
        return restTemplate.index(build,IndexCoordinates.of(indexName));
    }
    
    /**
     * Bulk insert
     * @author Innocence
     * @param entities the entities to insert
     * @param indexName index name
     * @return java.util.List<java.lang.String> the list of document ids
     */
    public List<String> saveBatchByEntities(List<EsSourceInfo> entities, String indexName) throws Exception{
        List<IndexQuery> queryList = new ArrayList<>();
        for (EsSourceInfo item:entities){
            if (StrUtil.isBlank(item.getLngId())){
                throw new Exception("The document id must not be blank!");
            }
            IndexQuery build = new IndexQueryBuilder().withId(item.getLngId()).withObject(item).build();
            queryList.add(build);
        }
        return restTemplate.bulkIndex(queryList, IndexCoordinates.of(indexName));
    }

    /**
     * Update a single document
     * @author Innocence
     * @param entity the entity to update
     * @param indexName index name
     * @return void
     */
    public void updateByEntity(EsSourceInfo entity, String indexName) {
        Map<String,Object> map = null;
        try {
            map = BeanUtils.describe(entity);
        } catch (Exception e) {
            e.printStackTrace();
        }
        Document document = Document.from(map);
        document.setId(entity.getLngId());
        // UpdateQuery needs a Document object, but a Document cannot be built directly from an entity
        // (see the Document source in the org.springframework.data.elasticsearch.core.document package).
        // That is why BeanUtils.describe(entity) is called above to turn the entity into a map, which is
        // then converted into a Document.
        UpdateQuery build = UpdateQuery.builder(entity.getLngId())
                .withDocAsUpsert(false) // defaults to false; true means insert the document if it does not exist
                .withDocument(document)
                .build();
        restTemplate.update(build, IndexCoordinates.of(indexName));
    }
    
    /**
     * Bulk update from maps
     * @author Innocence
     * @param maps the data to update, one map per document
     * @param indexName index name
     * @return void
     */
    public void updateByMaps(List<Map<String, Object>> maps, String indexName) {
        List<UpdateQuery> updateQueries = new ArrayList<>();
        maps.forEach(item->{
            Document document = Document.from(item);
            document.setId(String.valueOf(item.get("lngid")));
            UpdateQuery build = UpdateQuery.builder(document.getId())
                    .withDocument(document)
                    .build();
            updateQueries.add(build);
        });
        restTemplate.bulkUpdate(updateQueries,IndexCoordinates.of(indexName));
    }

    /**
     * Delete a document by id
     * @author Innocence
     * @param id
     * @param indexName index name
     * @return java.lang.String the deleted id
     */
    public String deleteById(String id, String indexName) {
        return restTemplate.delete(id,IndexCoordinates.of(indexName));
    }
    
    /**
     * Delete documents by a list of ids
     * @author Innocence
     * @param docIdName name of the document id field, e.g. "lngId" as used above
     * @param ids the ids to delete
     * @param indexName  index name
     * @return void
     */
    public void deleteByIds(String docIdName , List<String> ids, String indexName) {
        StringQuery query = new StringQuery(QueryBuilders.termsQuery(docIdName, ids).toString());
        restTemplate.delete(query,null,IndexCoordinates.of(indexName));
    }
    
    /**
     * Delete documents matching a query
     * @author Innocence
     * @param query the query builder
     * @param clazz the entity class of the data
     * @param indexName index name
     * @return void
     */
    public void deleteByQuery(Query query, Class<?> clazz, String indexName) {
        restTemplate.delete(query,clazz,IndexCoordinates.of(indexName));
    }
    
    /**
     * Get a document by id (the index is taken from the @Document annotation on the entity)
     * @param id
     */
    public EsSourceInfo getEntityById(String id) {
        return restTemplate.get(id,EsSourceInfo.class);
    }

    /**
     * Count the documents matching a query
     * @author Innocence
     * @return java.lang.Long
     */
    public Long getCount(Query query, Class clazz) {
        return restTemplate.count(query,clazz);
    }
    
    /**
     * Query the list of entities matching a query
     * @author Innocence
     * @param query the query, usually built with NativeSearchQuery
     * @return java.util.List<com.cqvip.innocence.project.model.entity.EsSourceInfo>
     */
    public  List<EsSourceInfo> getInfoList(Query query, String indexName) {
        // ES caps the number of retrievable hits at 10,000 by default. The official advice is not to
        // raise the limit (it hurts performance) but to use scroll instead; if more than 10,000
        // documents match, at most 10,000 are returned. setTrackTotalHits(true) makes ES report the
        // real total hit count.
        query.setTrackTotalHits(true);
        SearchHits<EsSourceInfo> search = restTemplate.search(query, EsSourceInfo.class, IndexCoordinates.of(indexName));
        return mappingHighlight(search.getSearchHits()); // map the highlighted fields
    }
    
    /**
      * Paged query
      * @author Innocence
      * @param query the query, usually built with NativeSearchQuery
      * @return java.util.Map<java.lang.String,java.lang.Object> containing the page ("page") and the aggregations ("ag")
      */
    public Map<String, Object> getPageList(Query query, PageRequest pageRequest) {
        query.setTrackTotalHits(true);
        SearchHits<EsSourceInfo> search = restTemplate.search(query, EsSourceInfo.class);
        Aggregations aggregations = search.getAggregations();
        List<SearchHit<EsSourceInfo>> searchHits = search.getSearchHits();
        List<EsSourceInfo> esSourceInfos = mappingHighlight(searchHits);
        Page infos = new PageImpl(
                esSourceInfos,
                pageRequest,
                search.getTotalHits());
        Map<String, Object> map = new HashMap<>();
        map.put("page",infos);
        map.put("ag",formatFacet(aggregations));
        return map;
    }
    
    /**
      * Get aggregation (facet) information for a query.
      * The aggregation queries wrapped by spring-data are quite slow, so the client is used directly here.
      * @author Innocence
      * @date 2021/3/19
      * @param query the query; indexName the index name
      * @return java.util.Map<java.lang.String,java.util.List<? extends org.elasticsearch.search.aggregations.bucket.terms.Terms.Bucket>>
      */
    public Map<String, List<? extends Terms.Bucket>> getFacetByQuery(SearchSourceBuilder query, String indexName) throws IOException {
        SearchRequest request = new SearchRequest(indexName);
        SearchSourceBuilder builder = query;
        request.source(builder);
        SearchResponse response = highLevelClient.search(request, RequestOptions.DEFAULT);
        Aggregations aggregations = response.getAggregations();
        return formatFacet(aggregations);
    }

    /**
     * Format the aggregation results
     * @author Innocence
     * @param aggregations
     * @return java.util.Map<java.lang.String,java.util.List<? extends org.elasticsearch.search.aggregations.bucket.terms.Terms.Bucket>>
     */
    private Map<String, List<? extends Terms.Bucket>> formatFacet(Aggregations aggregations){
        if (aggregations == null){
            return null;
        }
        Map<String, List<? extends Terms.Bucket>> map = new HashMap<>();
        List<Aggregation> list = aggregations.asList();
        list.forEach(item->{
            ParsedStringTerms newItem = (ParsedStringTerms) item;
            String name = newItem.getName();
            List<? extends Terms.Bucket> buckets = newItem.getBuckets();
            map.put(name,buckets);
        });
        return map;
    }
}
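
A quick usage sketch of the document helper above (the id, field values and index name are illustrative):

    @Autowired
    private DocumentOperation documentOperation;

    public void demo() throws Exception {
        EsSourceInfo info = new EsSourceInfo();
        info.setLngId("1001");
        info.setDiscreption("SpringBoot and ElasticSearch integration demo");
        documentOperation.saveByEntity(info, "my_index");               // index one document
        Boolean exists = documentOperation.isExist("1001", "my_index"); // true once indexed
        EsSourceInfo loaded = documentOperation.getEntityById("1001");  // read it back
        documentOperation.deleteById("1001", "my_index");               // and delete it
    }
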
  • 6. Building query conditions

Several Query builders appear in the document operations above. IndexQuery and UpdateQuery are built and used directly inside that utility class; the part that usually causes confusion is building the search Query. Here is a real example from one of my projects; for the finer details you will still need to read the documentation.
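
Before the full controller, a minimal sketch of building a NativeSearchQuery and passing it to the helper above (field name, keyword and index name are illustrative):

    NativeSearchQuery query = new NativeSearchQueryBuilder()
            .withQuery(QueryBuilders.matchQuery("discreption", "elasticsearch"))
            .withHighlightBuilder(documentOperation.getHighlightBuilder(new String[]{"discreption"}))
            .withPageable(PageRequest.of(0, 10))
            .build();
    List<EsSourceInfo> result = documentOperation.getInfoList(query, "my_index");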

package com.cqvip.innocence.project.controller.front.search;

import cn.hutool.core.util.StrUtil;
import com.cqvip.innocence.common.annotation.SensitiveTag;
import com.cqvip.innocence.common.constant.EsIndexConstant;
import com.cqvip.innocence.project.esservice.DocumentService;
import com.cqvip.innocence.project.model.dto.JsonResult;
import com.cqvip.innocence.project.model.dto.SearchModel;
import com.cqvip.innocence.project.model.dto.SearchParams;
import com.cqvip.innocence.project.model.enums.ResourceType;
import com.cqvip.innocence.project.model.enums.SearchFiled;
import io.swagger.annotations.ApiOperation;
import org.elasticsearch.index.query.*;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.BucketOrder;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.sort.SortBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.core.query.NativeSearchQuery;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.web.bind.annotation.*;

import javax.servlet.http.HttpServletRequest;
import java.io.IOException;
import java.util.*;
import java.util.stream.Collectors;

/**
 * @ClassName CommonSearchController
 * @Description Generic search endpoints for the front end
 * @Author Innocence
 * @Date 2021/3/18 9:26
 * @Version 1.0
 */
@RestController
@RequestMapping("/search/")
public class CommonSearchController {

    @Autowired
    private DocumentService documentService;

    
    @GetMapping("getArticleFacetsToMedia")
    @ApiOperation("期刊详情页获取文献年期聚类")
    public JsonResult getArticleFacets(String gch) throws IOException {
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        boolQueryBuilder.must(QueryBuilders.matchQuery("workType",ResourceType.ARTICLE.getCode()))
                .must(QueryBuilders.termQuery("gch",gch));
        TermsAggregationBuilder yearsFacet = AggregationBuilders
                .terms("years")
                .field("years")
                .order(BucketOrder.key(false))
                .size(150);
        yearsFacet.subAggregation(AggregationBuilders
                .terms("num")
                .field("num")
                .order(BucketOrder.key(true))
                .size(150));
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(boolQueryBuilder).aggregation(yearsFacet);
        Map<String, List<? extends Terms.Bucket>> facetByQuery = documentService.getFacetByQuery(searchSourceBuilder, EsIndexConstant.INDEX_NAME);
        return JsonResult.Get().putRes(facetByQuery);
    }
    
    @SensitiveTag
    @PostMapping("advanceSearch")
    @ApiOperation("所有资源分页检索")
    public JsonResult getAllSourcePage(@RequestBody  SearchParams params,HttpServletRequest request){
        Integer current;
        // set up paging
        if (params.getPageNum() == null){
            current = 0;
        }else {
            current = params.getPageNum()-1;
        }
        if (params.getPageSize() == null){
            params.setPageSize(20);
        }
        PageRequest pageRequest = PageRequest.of(current,params.getPageSize());
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
        // sorting
        if (params.getSortMap()!= null){
            nativeSearchQueryBuilder = setSort(params.getSortMap());
        }
        // aggregations
        List<String> list = new ArrayList<>();
        list.add("workType");
        list.add("classTypesArrays");
        list.add("keywordsArrays");
        list.add("writersArrays");
        List<TermsAggregationBuilder> termsAggregationBuilders = addAggregation(list);
        for (TermsAggregationBuilder item:termsAggregationBuilders) {
            nativeSearchQueryBuilder.addAggregation(item);
        }
        setParams(params,boolQueryBuilder,nativeSearchQueryBuilder);
        NativeSearchQuery build = nativeSearchQueryBuilder
                .withQuery(boolQueryBuilder)
                .withPageable(pageRequest)
                .build();
        Map<String, Object> pageList = documentService.getPageList(build, pageRequest);
        return JsonResult.Get().putRes(pageList);
    }

    /**
     * Assemble the query conditions for the all-resources search
     * @author Innocence
     * @date 2021/3/23
     * @param params
     * @param boolQueryBuilder
     * @return void
     */
    @SensitiveTag
    private void setParams(SearchParams params,BoolQueryBuilder boolQueryBuilder,NativeSearchQueryBuilder nativeSearchQueryBuilder){
        List<String> highList = new ArrayList<>();
        List<String> classTypes = params.getClassTypes();
        // search by subject classification
        if (classTypes != null && classTypes.size()>0){
            TermsQueryBuilder termsQueryBuilder = QueryBuilders.termsQuery("classTypesArrays.keyword", classTypes);
            boolQueryBuilder.must(termsQueryBuilder);
        }
        // 1. Simple search (possibly with no values at all, in which case all data is queried)
        if (params.getSimpleSearchParams() != null && StrUtil.isNotBlank(params.getSimpleSearchParams().searchKeyword)){
            List<String> fields = getField(params.getSimpleSearchParams().getSearchField());
            fields.forEach(item->highList.add(item));
            DisMaxQueryBuilder queryBuilder = QueryBuilders.disMaxQuery().tieBreaker(0.05f);
            Boolean isExact = params.getSimpleSearchParams().getIsExact();
            fields.forEach(item->{
                if (item.contains("title") || item.contains("media") || item.contains("book")){
                    MatchQueryBuilder boost = QueryBuilders
                            .matchQuery(item, params.getSimpleSearchParams().searchKeyword).boost(2f);
                    queryBuilder.add(boost);
                }else{
                    if (isExact != null && isExact == true){
                        QueryBuilder matchPhraseQueryBuilder = QueryBuilders.termQuery(item, params.getSimpleSearchParams().searchKeyword);
                        queryBuilder.add(matchPhraseQueryBuilder);
                    }else {
                        QueryBuilder matchPhraseQueryBuilder = QueryBuilders.matchPhraseQuery(item, params.getSimpleSearchParams().searchKeyword);
                        queryBuilder.add(matchPhraseQueryBuilder);
                    }
                }
            });
            boolQueryBuilder.must(queryBuilder);
        }
        // 2. Secondary search (search within results)
        if (params.getSecondSearchParams() != null && params.getSecondSearchParams().size()>0){
            List<SearchModel> models = params.getSecondSearchParams();
            models.forEach(item->{
                List<String> fields = getField(item.getSearchField());
                fields.forEach(h->highList.add(h));
                DisMaxQueryBuilder queryBuilder = QueryBuilders.disMaxQuery().tieBreaker(0.05f);
                fields.forEach(e->{
                    if (e.contains("title") || e.contains("media") || e.contains("book")){
                        MatchPhraseQueryBuilder boost = QueryBuilders
                                .matchPhraseQuery(e, item.getSearchKeyword()).boost(2f);
                        queryBuilder.add(boost);
                    }else{
                        MatchPhraseQueryBuilder matchPhraseQueryBuilder = QueryBuilders
                                .matchPhraseQuery(e, item.getSearchKeyword());
                        queryBuilder.add(matchPhraseQueryBuilder);
                    }
                });
                boolQueryBuilder.must(queryBuilder);
            });
        }
        // 3. Facet (aggregation) search
        if (params.getFacetSearchParams() != null && params.getFacetSearchParams().size() > 0) {
            params.getFacetSearchParams().forEach((item) -> {
                List<String> fields = getField(item.getSearchField());
                fields.forEach(h->highList.add(h));
                DisMaxQueryBuilder queryBuilder = QueryBuilders.disMaxQuery().tieBreaker(0.05f);
                fields.forEach(e -> {
                    if(e.equals(SearchFiled.WORKTYPE.getValue())){
                        MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery(e, item.getSearchKeyword());
                        queryBuilder.add(matchQueryBuilder);
                    }else {
                        MatchPhraseQueryBuilder matchPhraseQueryBuilder = QueryBuilders.matchPhraseQuery(e, item.getSearchKeyword());
                        queryBuilder.add(matchPhraseQueryBuilder);
                    }
                });
                boolQueryBuilder.must(queryBuilder);
            });
        }
        // 4. Advanced search
        if (params.getAdvanceSearchParams() != null && params.getAdvanceSearchParams().size() > 0) {
            BoolQueryBuilder bool=new BoolQueryBuilder();
            params.getAdvanceSearchParams().forEach((item) -> {
                List<String>  fields = getField( item.getSearchField());
                fields.forEach(h->highList.add(h));
                // exact match
                BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
                BoolQueryBuilder child=new BoolQueryBuilder();
                if (item.getIsExact()) {
                    fields.forEach(field-> child.should(QueryBuilders.termQuery(field,item.getSearchKeyword())));
                    queryBuilder.must(child);
                }else {
                    MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery(
                            item.getSearchKeyword(), fields.toArray(new String[fields.size()]));
                    queryBuilder.should(multiMatchQueryBuilder);
                }

                if (StrUtil.equals("AND", item.getLogicOperator().trim())) {
                    bool.must(queryBuilder);
                } else if (StrUtil.equals("OR", item.getLogicOperator().trim())) {
                    bool.should(queryBuilder);
                } else if (StrUtil.equals("NOT", item.getLogicOperator().trim())) {
                    bool.mustNot(queryBuilder);
                }
            });
            boolQueryBuilder.must(bool);
        }
        List<String> highFields = filterHighFields(highList);
        nativeSearchQueryBuilder.withHighlightBuilder(setHighlight(highFields));
    }

    /**
     * Build the highlight configuration from a list of field names
     * @author Innocence
     * @date 2021/3/23
     * @param list
     * @return org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder
     */
    private HighlightBuilder setHighlight(List<String> list){
        List<String> newList = new ArrayList<>();
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i).indexOf(",") != -1){
                // split comma-joined field names and add each part separately
                String[] split = list.get(i).split(",");
                for (int j = 0; j < split.length; j++) {
                    newList.add(split[j]);
                }
            } else {
                newList.add(list.get(i));
            }
        }
        String[] strings = newList.toArray(new String[newList.size()]);
        HighlightBuilder highlightBuilder = documentService.getHighlightBuilder(strings);
        return highlightBuilder;
    }

    /**
     * Resolve the actual index field names from a search-field enum value
     * @param filed   the field to search
     * @return {@link String} e.g. title,title_c
     * @author 01
     * @date 2021/1/19 18:40
     */
    private List<String> getField( SearchFiled filed) {
        List<String> fields = new ArrayList<>();
        // any field
        if (StrUtil.equals(filed.getValue(), SearchFiled.ALL.getValue())) {
            SearchFiled[] values = SearchFiled.values();
            for (SearchFiled value : values) {
                if (StrUtil.equals(value.getValue(), SearchFiled.ALL.getValue())
                        || StrUtil.equals(value.getValue(), SearchFiled.GCH.getValue())
                        || StrUtil.equals(value.getValue(), SearchFiled.PUBLISHER.getValue())) {
                    continue;
                }
                fields.add(value.getValue());
            }
            // title
        }else if (StrUtil.equals(filed.getValue(),SearchFiled.TITLE.getValue())){
            String titleValue = SearchFiled.TITLE.getValue();
            for (String value : titleValue.split(",")) {
                fields.add(value);
            }
        } else {
            fields.add(filed.getValue()); // a single field
        }
        List<String> strings = new ArrayList<>();
        fields.forEach(item ->{
            if (item.indexOf(",") !=-1){
                String[] split = item.split(",");
                for (int i = 0; i < split.length; i++) {
                    strings.add(split[i]);
                }
            }else {
                strings.add(item);
            }
        });
        return strings;
    }

    /**
     * Assemble the sort fields
     * @author Innocence
     * @date 2021/3/18
     * @param map
     * @return org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder
     */
    private NativeSearchQueryBuilder setSort(Map<String, SortOrder> map){
        NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
        for(Map.Entry<String, SortOrder> entry : map.entrySet()){
            SortBuilder order;
            if (entry.getKey().startsWith("_")){
                if (entry.getValue().equals(SortOrder.ASC)){
                    order = SortBuilders.fieldSort(entry.getKey()).order(SortOrder.ASC);
                }else {
                    order = SortBuilders.fieldSort(entry.getKey()).order(SortOrder.DESC);
                }
            }else {
                if (entry.getValue().equals(SortOrder.ASC)){
                    order = SortBuilders.fieldSort(entry.getKey()+".keyword").order(SortOrder.ASC);
                }else {
                    order = SortBuilders.fieldSort(entry.getKey()+".keyword").order(SortOrder.DESC);
                }
            }
            nativeSearchQueryBuilder.withSort(order);
        }
        return nativeSearchQueryBuilder;
    }

    /**
     * Assemble the aggregation (facet) fields
     * @author Innocence
     * @date 2021/3/18
     * @param fields the fields to aggregate on
     * @return java.util.List<org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder>
     */
    private List<TermsAggregationBuilder> addAggregation(List<String> fields){
        List<TermsAggregationBuilder> builders = new ArrayList<>();
        fields.forEach(item->{
            TermsAggregationBuilder size;
            if (item.equals("workType")){
                size = AggregationBuilders.terms(item).field(item+".keyword").size(150);
            }else {
                size = AggregationBuilders.terms(item).field(item).size(150);
            }
            builders.add(size);
        });
        return builders;
    }
    

    /**
     * Highlight fields cannot be the array fields, so they are mapped here to their non-array counterparts
     * @author Innocence
     * @date 2021/3/24
     * @return java.util.List<java.lang.String>
     */
    private List<String> filterHighFields(List<String> fields){
        List<String> myList = fields.stream().distinct().collect(Collectors.toList());
        List<String> strings = new ArrayList<>();
        for (int i = 0; i < myList.size(); i++) {
            if (myList.get(i).equals(SearchFiled.CLASS_TYPES_ARRAYS.getValue())){
                strings.add(SearchFiled.CLASS_TYPES.getValue());
            }else if(myList.get(i).equals(SearchFiled.KEYWORDS_ARRAYS.getValue())){
                strings.add(SearchFiled.KEYWORD.getValue());
            }else if (myList.get(i).equals(SearchFiled.WRITERS_ARRAYS.getValue())){
                strings.add(SearchFiled.WRITER.getValue());
            }else if(myList.get(i).equals(SearchFiled.COLLECT_DATABASE_ARRAYS.getValue())) {
                strings.add(SearchFiled.COLLECT_DATABASE.getValue());
            }else if(myList.get(i).equals("workType")){
                continue;
            }else {
                strings.add(myList.get(i));
            }
        }
        return strings.stream().distinct().collect(Collectors.toList());
    }

}
