Follow the Path: Hierarchy-Aware Extreme Multi-Label Completion for Semantic Text Tagging

Published: 23 Jan 2024, Last Modified: 23 May 2024, TheWebConf24 Oral
Keywords: Semantic Tagging, Taxonomy, Extreme Multi-Label Classification, Label Completion, Transformers
Abstract: Extreme Multi-Label (XML) problems, and in particular XML completion (the task of predicting the missing labels of an entity), have attracted significant attention in the past few years. Most XML completion problems can organically leverage a label hierarchy, which can be represented as a tree that encodes the relations between the different labels. In this paper, we propose a new algorithm, HECTOR (Hierarchical Extreme Completion for Text based on transfORmer), to solve XML completion problems more effectively. HECTOR operates by directly predicting paths in this tree, instead of simple labels, thus taking advantage of the information encoded in the hierarchy. Due to the sequential nature of these paths, HECTOR can leverage the effectiveness and performance of the Transformer architecture to outperform state-of-the-art XML completion methods. We compare HECTOR with several state-of-the-art XML completion methods on various completion problems, and in particular on label refinement, that is, the scenario where only the coarse labels (i.e., the first few top levels of a taxonomy) are observed. Extensive evaluations on three real-world datasets show that our method significantly outperforms the state of the art, with HECTOR frequently outperforming previous techniques by more than 10% according to multiple metrics.
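To make the idea of predicting paths rather than flat labels concrete, the following is a minimal sketch of how a label in a tree-shaped taxonomy can be turned into a root-to-label sequence, and how label refinement splits that sequence into an observed coarse prefix and a suffix to be completed. It is an illustration only, not the authors' implementation: the toy taxonomy `PARENT` and the helper names `label_to_path` and `split_for_refinement` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy taxonomy (an assumption for illustration):
# each label maps to its parent; None marks the root level.
PARENT: Dict[str, Optional[str]] = {
    "science": None,
    "physics": "science",
    "quantum mechanics": "physics",
    "biology": "science",
}


def label_to_path(label: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the root-to-label path for `label` by walking up the tree."""
    path: List[str] = []
    node: Optional[str] = label
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()  # the walk collected leaf-to-root; flip to root-to-leaf
    return path


def split_for_refinement(path: List[str], observed_depth: int) -> Tuple[List[str], List[str]]:
    """Split a path into the observed coarse prefix and the part left to predict."""
    return path[:observed_depth], path[observed_depth:]


path = label_to_path("quantum mechanics", PARENT)
observed, to_predict = split_for_refinement(path, observed_depth=1)
print(path)        # full path as a sequence of labels
print(observed)    # coarse labels assumed to be given
print(to_predict)  # finer labels a sequence model would have to generate
```

Because each path is an ordered sequence of labels, it can be treated like a token sequence, which is what makes a sequence model such as a Transformer a natural fit for generating the missing suffix.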
Track: Semantics and Knowledge
Submission Guidelines Scope: Yes
Submission Guidelines Blind: Yes
Submission Guidelines Format: Yes
Submission Guidelines Limit: Yes
Submission Guidelines Authorship: Yes
Student Author: Yes
Submission Number: 1446