Local-Global Cascaded Ensemble Learning on Hybrid Experts for Knowledge Concept Tagging

Published: 2025 · Last Modified: 22 Jan 2026 · AIED (4) 2025 · CC BY-SA 4.0
Abstract: Knowledge concept tagging aims to label exercises with specific knowledge concepts. Traditional manual annotation methods are increasingly unable to meet the demands of annotating large-scale, high-quality data. While automated methods have been developed to streamline this process, they still struggle with fine-grained knowledge concept labeling. This limitation arises from the inherent difficulty of accurately annotating text content with specific knowledge concepts selected from a vast pool of candidates, a challenge further exacerbated by the similarity among fine-grained knowledge concepts. In this paper, we propose a Local-Global Cascaded Ensemble Learning (LGCEL) method based on hybrid experts for knowledge concept tagging to address this issue. Specifically, LGCEL first conducts unsupervised domain continual pre-training, e.g., masked language modeling and causal language modeling, to obtain in-domain models; it then performs supervised fine-tuning to produce diverse tagging experts; and finally it combines the hybrid fine-tuned models, including lightweight large language models (LLMs), through voting-based ensembling. Experimental results demonstrate the effectiveness of LGCEL in annotating fine-grained knowledge concepts and its superiority in integrating mixed tagging experts to improve the annotation accuracy of hard-to-distinguish knowledge concepts.
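The final voting-ensemble step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the expert models, their predicted tags, and the simple majority-vote rule are all hypothetical stand-ins for the fine-tuned tagging experts LGCEL combines.

```python
from collections import Counter

def vote_ensemble(expert_predictions, top_k=1):
    """Majority-vote ensemble over knowledge-concept tags.

    expert_predictions: one list of predicted concept tags per expert
    (e.g. fine-tuned encoder models and lightweight LLM taggers).
    Returns the top_k concepts receiving the most expert votes.
    """
    votes = Counter(tag for preds in expert_predictions for tag in preds)
    return [tag for tag, _ in votes.most_common(top_k)]

# Hypothetical predictions from three tagging experts for one exercise.
experts = [
    ["quadratic_equation"],
    ["quadratic_equation", "factoring"],
    ["quadratic_equation"],
]
print(vote_ensemble(experts, top_k=1))  # -> ['quadratic_equation']
```

In practice each expert would emit scored candidates rather than bare tags, and votes could be weighted by per-expert validation accuracy; the hard-voting rule above is only the simplest instance of the ensembling idea.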