admin

Spring Data Elasticsearch: Highlighting Full-Text Search Results
2019/03/05

Spring Data Elasticsearch version: 3.1.4

// query is presumably a NativeSearchQueryBuilder; boolQuery and pageable are built elsewhere
query.withPageable(pageable)
        .withQuery(boolQuery)
        // ask Elasticsearch to highlight matches in the fileName field
        .withHighlightFields(new HighlightBuilder.Field("fileName"));
Page<Torrent> page = template.queryForPage(query.build(), Torrent.class, new SearchResultMapper() {

    @Override
    public <T> AggregatedPage<T> mapResults(SearchResponse response, Class<T> clazz, Pageable pageable) {

        long totalHits = response.getHits().getTotalHits();
        float maxScore = response.getHits().getMaxScore();

        List<Torrent> results = new ArrayList<>();
        for (SearchHit hit : response.getHits().getHits()) {
            if (hit == null)
                continue;
            // deserialize the stored source into a Torrent (JSON.parseObject looks like fastjson)
            Torrent result = JSON.parseObject(hit.getSourceAsString(), Torrent.class);
            // the document ID is not part of the source, so set it separately
            result.setInfoHash(hit.getId());
            // prefer the first highlighted fragment; fall back to the raw value
            if (hit.getHighlightFields().containsKey("fileName"))
                result.setFileName(hit.getHighlightFields().get("fileName").fragments()[0].toString());
            else
                result.setFileName((String) hit.getSourceAsMap().get("fileName"));
            results.add(result);
        }
        // unchecked cast: this mapper only works when T is Torrent
        return new AggregatedPageImpl<>((List<T>) results, pageable, totalHits, response.getAggregations(), response.getScrollId(),
                    maxScore);
    }
});

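The key is to implement the SearchResultMapper interface and, inside it, map the highlight results onto the object's fields.
By default, withHighlightFields(new HighlightBuilder.Field("fileName")) wraps each highlighted term in <em></em> tags; you can customize them by calling preTags() and postTags().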
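
To make the effect of the tags concrete, here is a toy stand-in for the highlighter. It is plain Java, not Elasticsearch's implementation, and TagDemo and wrap are names made up for this sketch. It wraps every case-insensitive occurrence of a term in a pre/post tag pair. With the real API, the equivalent customization would be something like new HighlightBuilder.Field("fileName").preTags("<b>").postTags("</b>").

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; Elasticsearch does the real
// highlighting on the server based on the analyzed tokens.
class TagDemo {

    /** Wraps every case-insensitive occurrence of term in text with pre and post. */
    static String wrap(String text, String term, String pre, String post) {
        // Pattern.quote treats the term literally, so dots and brackets are safe
        Pattern pattern = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
        // quoteReplacement protects '$' and '\' inside the tags; $0 is the matched text
        String replacement = Matcher.quoteReplacement(pre) + "$0" + Matcher.quoteReplacement(post);
        return pattern.matcher(text).replaceAll(replacement);
    }
}
```

Calling TagDemo.wrap(name, "midwife", "<em>", "</em>") mimics the default tags, while passing a different pair mimics what preTags() and postTags() change.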

To keep things simple, I return a List<Torrent> directly to the page object inside the overridden mapResults method. Printing a SearchHit object gives the following:

{
    "fields":{

    },
    "fragment":false,
    "highlightFields":{

    },
    "id":"a489303220cf4e33d75197eae471a5a2b77cbc2d",
    "matchedQueries":[

    ],
    "score":null,
    "shard":{
        "fullyQualifiedIndexName":"dodder",
        "index":"dodder",
        "nodeId":"5n5CODjxStSb5lfHx3sJyw",
        "nodeIdText":{
            "fragment":true
        },
        "shardId":{
            "fragment":true,
            "id":4,
            "index":{
                "fragment":false,
                "name":"dodder",
                "uUID":"DEHtBBaDRdaFtp0VmgF-bQ"
            },
            "indexName":"dodder"
        }
    },
    "sortValues":[
        1551719030508
    ],
    "sourceAsMap":{
        "fileName":"Call.The.Midwife.S05E07.HDTV.Subtitulado.Esp.SC.avi",
        "fileSize":575272960,
        "fileType":"视频",
        "createDate":1551719030508
    },
    "sourceAsString":"{\"fileName\":\"Call.The.Midwife.S05E07.HDTV.Subtitulado.Esp.SC.avi\",\"fileType\":\"视频\",\"fileSize\":575272960,\"createDate\":1551719030508,\"files\":null}",
    "sourceRef":{
        "childResources":[

        ],
        "fragment":true
    },
    "type":"torrent",
    "version":1
}

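The fields to look at are sourceAsMap and sourceAsString. Both contain the entity's properties: the former as a Map, the latter as a JSON string. Since the goal is to highlight the search keywords, highlightFields matters too; I ran this query without a keyword, so that field is empty here.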
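
The relationship between sourceAsMap and highlightFields can be sketched with plain maps. This is only an illustration under assumed types: HighlightMerge is a made-up name, and the highlights are modeled as Map<String, List<String>> rather than Elasticsearch's HighlightField. A field with at least one fragment takes its first fragment; every other field keeps its source value.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: models sourceAsMap and highlightFields as plain maps.
class HighlightMerge {

    /**
     * Returns a copy of source in which each field that has at least one
     * highlight fragment is replaced by its first fragment.
     */
    static Map<String, Object> merge(Map<String, Object> source,
                                     Map<String, List<String>> highlights) {
        Map<String, Object> merged = new HashMap<>(source);
        for (Map.Entry<String, List<String>> entry : highlights.entrySet()) {
            List<String> fragments = entry.getValue();
            // only overwrite fields that exist in the source and have a fragment
            if (fragments != null && !fragments.isEmpty() && merged.containsKey(entry.getKey())) {
                merged.put(entry.getKey(), fragments.get(0));
            }
        }
        return merged;
    }
}
```

With an empty highlights map, as in the dump above, merge simply returns a copy of the source values.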
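The mapper I wrote earlier is tied to the Torrent index results and is not reusable. The interface method, however, returns a generic <T> AggregatedPage<T>, so it is entirely possible to build a generic, highlight-aware component and register it in the Spring container, letting every query share a single SearchResultMapper.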

Looking for implementations of SearchResultMapper, we find that the library ships one: DefaultResultMapper.
Here is its mapResults method:

@Override
public <T> AggregatedPage<T> mapResults(SearchResponse response, Class<T> clazz, Pageable pageable) {

    long totalHits = response.getHits().getTotalHits();
    float maxScore = response.getHits().getMaxScore();
    // generic result list
    List<T> results = new ArrayList<>();
    for (SearchHit hit : response.getHits()) {
        if (hit != null) {
            T result = null;
            // as the JSON dump above shows, hit.getSourceAsString() is the entity's JSON string;
            // if it is not empty, mapEntity converts it to a Java object
            if (!org.springframework.util.StringUtils.isEmpty(hit.getSourceAsString())) {
                result = mapEntity(hit.getSourceAsString(), clazz);
            } else {
                // if hit.getSourceAsString() is empty, hit.getFields().values() is first converted to a
                // JSON string and then to a Java object; we could also convert sourceAsMap to JSON instead
                result = mapEntity(hit.getFields().values(), clazz);
            }
            // note that the JSON-to-object conversion above does not include the ID property,
            // whose field name is not fixed; it is presumably assigned via reflection
            setPersistentEntityId(result, hit.getId(), clazz);
            setPersistentEntityVersion(result, hit.getVersion(), clazz);
            setPersistentEntityScore(result, hit.getScore(), clazz);

            populateScriptFields(result, hit);
            results.add(result);
        }
    }

    return new AggregatedPageImpl<T>(results, pageable, totalHits, response.getAggregations(), response.getScrollId(),
            maxScore);
}

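The code above returns a generic result, so it is certainly more reusable than what I wrote earlier, where I cut a corner and hard-coded the conversion to Torrent. Following this code, we can write our own generic, highlight-aware SearchResultMapper.
For the implementation, we can use the setPersistentEntityId method as a reference: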


private <T> void setPersistentEntityId(T result, String id, Class<T> clazz) {
        
    if (clazz.isAnnotationPresent(Document.class)) {
            
        ElasticsearchPersistentEntity<?> persistentEntity = mappingContext.getRequiredPersistentEntity(clazz);
        ElasticsearchPersistentProperty idProperty = persistentEntity.getIdProperty();

        // Only deal with String because ES generated Ids are strings !
        if (idProperty != null && idProperty.getType().isAssignableFrom(String.class)) {
            persistentEntity.getPropertyAccessor(result).setProperty(idProperty, id);
        }
    }
}
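
Stripped of Spring Data's mapping metadata, the idea behind setPersistentEntityId looks roughly like this in plain reflection. This is a sketch under assumptions, not the library's code: the @DocId annotation, IdSetter, and SampleDoc are all hypothetical names standing in for Spring Data's @Id handling and a real entity.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Hypothetical marker annotation standing in for Spring Data's @Id
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface DocId {}

class IdSetter {

    /** Sets the first @DocId field that can hold a String; returns true on success. */
    static <T> boolean setId(T entity, String id) {
        // getDeclaredFields does not include inherited fields
        for (Field field : entity.getClass().getDeclaredFields()) {
            // same type check as the original: only fields that accept a String
            if (field.isAnnotationPresent(DocId.class) && field.getType().isAssignableFrom(String.class)) {
                try {
                    field.setAccessible(true);
                    field.set(entity, id);
                    return true;
                } catch (IllegalAccessException e) {
                    return false;
                }
            }
        }
        return false;
    }
}

// Hypothetical entity: the ID lives in a field whose name the mapper cannot know in advance
class SampleDoc {
    @DocId String infoHash;
    String fileName;
}
```

The point of the lookup is that the mapper never needs to know the ID field is called infoHash; the metadata tells it which field to fill.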

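setPersistentEntityId sets the ID field. What we want is highlighting, so we take highlightFields from the SearchHit, iterate over it, look up each highlighted field with persistentEntity.getPersistentProperty(), and replace the original value with the highlighted one.
A quick draft: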

// iterate over the highlighted fields
for (Map.Entry<String, HighlightField> entry : hit.getHighlightFields().entrySet()) {
    if (clazz.isAnnotationPresent(Document.class)) {

        ElasticsearchPersistentEntity<?> persistentEntity = mappingContext.getRequiredPersistentEntity(clazz);
        // look up the entity property that corresponds to the highlighted field
        ElasticsearchPersistentProperty highlightProperty = persistentEntity.getPersistentProperty(entry.getKey());

        // only String properties can hold a highlighted fragment
        if (highlightProperty != null && highlightProperty.getType().isAssignableFrom(String.class)) {
            // replace the original value with the first highlighted fragment
            persistentEntity.getPropertyAccessor(result).setProperty(highlightProperty, entry.getValue().fragments()[0].toString());
        }
    }
}

The code above has not been tested yet, but barring surprises this is how it should work.
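
For trying the idea out without a Spring context, the same replacement can be approximated with plain java.lang.reflect in place of mappingContext and getPropertyAccessor. This is a hedged sketch, not a drop-in SearchResultMapper: ReflectiveHighlighter and SampleTorrent are made-up names, highlights are modeled as Map<String, List<String>>, and field names are assumed to match the highlighted field names exactly.

```java
import java.lang.reflect.Field;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the generic highlight step, using plain reflection
class ReflectiveHighlighter {

    /** Overwrites each String field named in highlights with its first fragment. */
    static <T> T apply(T entity, Map<String, List<String>> highlights) {
        for (Map.Entry<String, List<String>> entry : highlights.entrySet()) {
            List<String> fragments = entry.getValue();
            if (fragments == null || fragments.isEmpty()) {
                continue;
            }
            try {
                // getDeclaredField only sees fields declared on the entity's own class
                Field field = entity.getClass().getDeclaredField(entry.getKey());
                // only String fields can hold a highlighted fragment
                if (field.getType() == String.class) {
                    field.setAccessible(true);
                    field.set(entity, fragments.get(0));
                }
            } catch (NoSuchFieldException | IllegalAccessException e) {
                // no matching or writable field: leave the entity unchanged for this key
            }
        }
        return entity;
    }
}

// Hypothetical entity used only to exercise the sketch
class SampleTorrent {
    String fileName;
    long fileSize;
}
```

Because the lookup is driven by the keys of the highlight map, the same method works for any entity class, which is the property the generic SearchResultMapper needs.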

Last modification:March 5th, 2019 at 11:18 pm