CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts

ACL ARR 2025 February Submission1515 Authors

13 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, Readers: Everyone, License: CC BY 4.0
Abstract: Taxonomies provide structural representations of knowledge and are crucial in various applications. The task of taxonomy expansion involves integrating emerging entities into existing taxonomies by identifying appropriate parent entities for these new query entities. Previous methods rely on self-supervised techniques that generate annotation data from existing taxonomies but are less effective with small taxonomies (fewer than 100 entities). In this work, we introduce CodeTaxo, a novel approach that leverages large language models through code language prompts to capture the taxonomic structure. Extensive experiments on five real-world benchmarks from different domains demonstrate that CodeTaxo consistently achieves superior performance across all evaluation metrics, significantly outperforming previous state-of-the-art methods. The code and data are available at https://anonymous.4open.science/r/CodeTaxo4Review-47DB.
Paper Type: Long
Research Area: Information Extraction
Research Area Keywords: knowledge base construction, zero/few-shot extraction
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 1515