FASETS: Discovering Faceted Sets of Entities

Published: 01 Jan 2024, Last Modified: 05 Aug 2024, Venue: WWW (Companion Volume) 2024, License: CC BY-SA 4.0
Abstract: Computing related entities for a given seed entity is an important task in exploratory search and comparative data analysis. Prior works, using the seed-based set expansion paradigm, have focused on the single aspect of identifying homogeneous sets with high pairwise relatedness. A few recent works discuss cluster-based approaches to multi-faceted set expansion; however, they fail to harness the specificity of the clusters and to generate explanations for them. This paper poses multi-faceted set expansion as an optimization problem, where the goal is to compute multiple groups of entities that convey different aspects in an explainable manner, with high similarity within each group and diversity across groups. To expand a seed entity, we collect a large pool of candidate entities and facets (e.g., categories) from Wikipedia and knowledge bases, and construct a candidate graph. We propose FASETS, an efficient algorithm for computing faceted groups of bounded size, based on random walks over the candidate graph. Our extensive evaluation shows the superiority of FASETS over prior baselines with regard to ground truth collected via crowdsourcing.
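
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: a bipartite candidate graph linking entities to facets, a random walk with restart from the seed, and bounded-size groups labeled by a facet. It is not the FASETS algorithm itself. The function names, the restart probability, the toy pool, and the greedy selection rule (rank facets by visit count, skip entities already placed in an earlier group to encourage diversity) are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

def build_candidate_graph(entity_facets):
    """Undirected bipartite graph linking each entity node to its facet nodes."""
    graph = defaultdict(set)
    for entity, facets in entity_facets.items():
        for facet in facets:
            graph[("entity", entity)].add(("facet", facet))
            graph[("facet", facet)].add(("entity", entity))
    return graph

def random_walk_visits(graph, seed, steps=20000, restart=0.3, rng=None):
    """Count node visits of a random walk with restart from the seed entity.

    `steps` and `restart` are illustrative values, not taken from the paper.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    start = ("entity", seed)
    if start not in graph:
        raise KeyError(f"seed entity {seed!r} is not in the candidate graph")
    visits = Counter()
    node = start
    for _ in range(steps):
        if rng.random() < restart:
            node = start  # jump back to the seed
        else:
            node = rng.choice(sorted(graph[node]))  # move to a uniform random neighbor
        visits[node] += 1
    return visits

def faceted_groups(entity_facets, seed, num_groups=2, group_size=2):
    """Greedy, assumed selection rule: best-visited facets first, each labeling
    a group of at most `group_size` not-yet-used entities."""
    graph = build_candidate_graph(entity_facets)
    visits = random_walk_visits(graph, seed)
    ranked_facets = [n for n, _ in visits.most_common() if n[0] == "facet"]
    used = {seed}
    groups = {}
    for facet_node in ranked_facets:
        if len(groups) == num_groups:
            break
        members = [e for kind, e in graph[facet_node] if e not in used]
        members.sort(key=lambda e: (-visits[("entity", e)], e))
        members = members[:group_size]
        if members:
            groups[facet_node[1]] = members  # the facet name explains the group
            used.update(members)
    return groups

# Hypothetical toy pool of candidate entities and their facets (categories).
toy_pool = {
    "Python": {"Dynamically typed languages", "Object-oriented languages"},
    "Ruby": {"Dynamically typed languages", "Object-oriented languages"},
    "JavaScript": {"Dynamically typed languages"},
    "Java": {"Object-oriented languages", "JVM languages"},
    "Scala": {"JVM languages"},
}

if __name__ == "__main__":
    for facet, members in faceted_groups(toy_pool, "Python").items():
        print(f"{facet}: {members}")
```

In this sketch the facet name doubles as the explanation of its group, and the `used` set is a crude stand-in for the across-group diversity that the abstract states as an optimization goal.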