Leveraging LLMs for Fine-grained Knowledge Component Extraction in Educational AI

ACL ARR 2026 January Submission3723 Authors

04 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0

Keywords: Knowledge Component, Educational AI, LLM, Personalized Tutoring, Think-Aloud Protocol

Abstract: AI tutoring systems require fine-grained skill representations to diagnose student knowledge gaps and deliver targeted instruction. However, existing automated Knowledge Component (KC) extraction methods lack principled strategies for controlling granularity and do not systematically capture the cognitive operations required for problem-solving. We present CogKC, a multi-stage LLM-based method that first generates a think-aloud protocol to externalize problem-solving reasoning, then applies the TIMSS cognitive framework to extract hierarchical KCs with explicit cognitive operations (e.g., recall, calculate, infer). Our approach produces finer-grained representations than expert annotations while maintaining interpretability. Evaluation through personalized question generation and tutoring simulation demonstrates improved question quality (68.2% win rate) and tutoring efficiency (14% gain) compared to baseline methods.

Paper Type: Long

Research Area: NLP Applications

Research Area Keywords: NLP Applications, Generation, Resources and Evaluation

Contribution Types: NLP engineering experiment

Languages Studied: English

Submission Number: 3723
