LDA-Based Resource Selection for Results Diversification in Federated Search

Published: 01 Jan 2018, Last Modified: 17 Apr 2025, Venue: WISA 2018, License: CC BY-SA 4.0
Abstract: Resource selection is an important step in a federated search environment, especially for search result diversification. Most prior work on resource selection in federated search considered only the relevance of a resource to the information need; very few studies considered both relevance and the diversity of the information the resources contain. In this paper, we propose a method that uses the Latent Dirichlet Allocation (LDA) model to discover the underlying topics in each resource by sampling a number of documents from it. The resulting vector representation of each resource can then be used to calculate the similarity between resources and to assess their diversity. Using a group of diversity-related metrics, we find that the LDA-based resource selection method is more effective than other state-of-the-art methods in the same category.
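The idea of comparing resources through topic vectors can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the per-document topic distributions have already been inferred by some LDA model, it assumes a resource vector is the mean of its sampled documents' distributions, and it swaps in a generic greedy MMR-style rule to combine relevance with dissimilarity. The function names, the `lam` trade-off parameter, and the toy numbers are all hypothetical.

```python
import math

def resource_vector(doc_topic_rows):
    """Average the topic distributions of the documents sampled from one resource.
    Assumption: each row is a topic distribution already inferred by an LDA model."""
    n_docs = len(doc_topic_rows)
    n_topics = len(doc_topic_rows[0])
    return [sum(row[k] for row in doc_topic_rows) / n_docs for k in range(n_topics)]

def cosine(u, v):
    """Cosine similarity between two topic vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_resources(relevance, vectors, k, lam=0.5):
    """Greedy MMR-style selection (an illustrative stand-in, not the paper's rule):
    reward relevance, penalise similarity to resources that were already chosen."""
    chosen = []
    candidates = set(relevance)
    while candidates and len(chosen) < k:
        def score(name):
            max_sim = max((cosine(vectors[name], vectors[c]) for c in chosen),
                          default=0.0)
            return lam * relevance[name] - (1 - lam) * max_sim
        # Sort first so that ties are broken deterministically by name.
        best = max(sorted(candidates), key=score)
        chosen.append(best)
        candidates.remove(best)
    return chosen

# Toy data: three resources, two sampled documents each, three topics.
samples = {
    "A": [[0.9, 0.1, 0.0], [0.8, 0.1, 0.1]],
    "B": [[0.85, 0.1, 0.05], [0.9, 0.05, 0.05]],
    "C": [[0.1, 0.1, 0.8], [0.0, 0.2, 0.8]],
}
vectors = {name: resource_vector(rows) for name, rows in samples.items()}
relevance = {"A": 0.9, "B": 0.8, "C": 0.6}  # hypothetical relevance scores

selected = select_resources(relevance, vectors, k=2)
print(selected)
```

In this toy setup, resources A and B cover nearly the same topics, so after A is picked the similarity penalty pushes the less relevant but topically different C ahead of B.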