Semantic Knowledge Augmented Hypergraph Contrastive Representation Learning for Zero-Shot Biomedical Text Classification

Published: 01 Jan 2025, Last Modified: 25 Sept 2025. Venue: PAKDD (1) 2025. License: CC BY-SA 4.0
Abstract: Zero-shot biomedical text classification is a fundamental problem in text mining: it assigns scientific articles labels that are unseen during training but available at inference time. This learning paradigm has practical implications for domains such as biomedicine, where new labels or concepts (e.g., novel diseases, genes, drugs) emerge regularly. While existing approaches have made significant advances, they fail to effectively leverage the complex semantic relationships between biomedical entities and thus yield unsatisfactory results. To address this issue, we propose a new approach that uses a hypergraph structure to capture the high-order semantic relationships between biomedical entities. To further enhance the expressive power of hypergraphs, we propose a novel augmentation strategy that uses semantic knowledge present in the biomedical domain to generate augmented hypergraph views. Taken together, the proposed approach generates the robust feature representations of biomedical entities needed to generalize better to unseen labels. Extensive experiments on the largest biomedical corpus validate the effectiveness of the proposed approach.
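The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of contrasting two hypergraph views, not the authors' method. It assumes a binary incidence matrix, a simple mean-aggregation propagation step, and a standard InfoNCE loss. The function names, the toy hyperedges, and the extra "knowledge" hyperedge used to form the augmented view are all hypothetical.

```python
import numpy as np

def incidence_matrix(num_nodes, hyperedges):
    """Build a binary node-by-hyperedge incidence matrix H."""
    H = np.zeros((num_nodes, len(hyperedges)))
    for j, edge in enumerate(hyperedges):
        H[list(edge), j] = 1.0
    return H

def propagate(H, X):
    """One round of node -> hyperedge -> node mean aggregation (assumed encoder)."""
    edge_deg = np.maximum(H.sum(axis=0), 1.0)  # nodes per hyperedge
    node_deg = np.maximum(H.sum(axis=1), 1.0)  # hyperedges per node
    edge_feat = (H.T @ X) / edge_deg[:, None]
    return (H @ edge_feat) / node_deg[:, None]

def info_nce(Z1, Z2, temperature=0.5):
    """InfoNCE loss: row i of Z1 is the positive for row i of Z2."""
    Z1 = Z1 / np.maximum(np.linalg.norm(Z1, axis=1, keepdims=True), 1e-12)
    Z2 = Z2 / np.maximum(np.linalg.norm(Z2, axis=1, keepdims=True), 1e-12)
    logits = (Z1 @ Z2.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Toy data: 4 entities with random features (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))

# Original view: hyperedges group entities that co-occur (hypothetical).
base_edges = [(0, 1, 2), (1, 3), (0, 3)]
# Augmented view: add one hyperedge standing in for external semantic
# knowledge, e.g. entities 2 and 3 sharing an ontology parent (assumption).
aug_edges = base_edges + [(2, 3)]

Z_base = propagate(incidence_matrix(4, base_edges), X)
Z_aug = propagate(incidence_matrix(4, aug_edges), X)
loss = info_nce(Z_base, Z_aug)
print("contrastive loss between views:", loss)
```

In this sketch, each hyperedge can connect more than two entities, which is what lets a hypergraph express high-order relationships that a pairwise graph cannot. The loss pulls together the two representations of the same entity across views and pushes apart representations of different entities.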