Annotation Guideline-Based Knowledge Augmentation: Toward Enhancing Large Language Models for Educational Text Classification

Published: 2025 · Last Modified: 14 Nov 2025 · IEEE Trans. Learn. Technol. 2025 · CC BY-SA 4.0
Abstract: Automated classification of learner-generated text to identify indicators of behavior, emotion, and cognition, collectively known as learning engagement classification (LEC), has received considerable attention in fields such as natural language processing (NLP), learning analytics, and educational data mining. Recently, large language models (LLMs) such as ChatGPT, regarded as promising technologies for artificial general intelligence, have demonstrated remarkable performance on various NLP tasks. However, their capabilities on LEC tasks still lack comprehensive evaluation and improvement approaches. This study introduces a novel benchmark for LEC comprising six datasets that cover behavior classification (question and urgency level), emotion classification (binary and epistemic emotion), and cognition classification (opinion and cognitive presence). In addition, we propose the annotation guideline-based knowledge augmentation (AGKA) approach, which leverages GPT-4.0 to recognize and extract label definitions from annotation guidelines and applies random undersampling to select a representative set of few-shot examples. Experimental results demonstrate the following: (1) AGKA improves LLM performance over vanilla prompts, particularly for GPT-4.0 and Llama-3 70B; (2) with AGKA, GPT-4.0 and Llama-3 70B are comparable to fully fine-tuned models such as BERT and RoBERTa on simple binary classification tasks; (3) on multiclass tasks requiring complex semantic understanding, GPT-4.0 and Llama-3 70B outperform the fine-tuned models in the few-shot setting but fall short of the fully fine-tuned models; (4) Llama-3 70B with AGKA performs comparably to GPT-4.0, demonstrating the viability of open-source alternatives; and (5) an ablation study highlights the importance of customizing and evaluating knowledge augmentation strategies for each specific LLM architecture and task.
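The AGKA pipeline described above (guideline-derived label definitions plus random undersampling of few-shot examples) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the prompt layout, and the balanced per-label sampling are assumptions, and the label-definition extraction step (done with GPT-4.0 in the paper) is represented here simply as a precomputed dictionary.

```python
import random
from collections import defaultdict

def undersample_examples(examples, k_per_label, seed=42):
    """Hypothetical AGKA step: randomly undersample the labeled pool
    so each label contributes at most k_per_label few-shot examples."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)
    selected = []
    for label in sorted(by_label):  # deterministic label order
        texts = by_label[label][:]
        rng.shuffle(texts)
        selected.extend((t, label) for t in texts[:k_per_label])
    return selected

def build_prompt(label_definitions, shots, query):
    """Assemble an AGKA-style prompt: label definitions extracted from
    the annotation guideline, then few-shot examples, then the query."""
    lines = ["Label definitions (from the annotation guideline):"]
    lines += [f"- {label}: {definition}"
              for label, definition in label_definitions.items()]
    lines.append("\nExamples:")
    lines += [f'Text: "{t}" -> Label: {l}' for t, l in shots]
    lines.append(f'\nText: "{query}" -> Label:')
    return "\n".join(lines)
```

For a binary emotion task, `build_prompt({"positive": "...", "negative": "..."}, shots, "I finally understood recursion!")` would yield a prompt that an LLM completes with a single label; the undersampling keeps the in-context examples balanced across classes.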