Keywords: Student Engagement, Longitudinal Experiential Data, Qualitative Data, Large Language Models, Missing-Not-At-Random, Imputation, Zero-Shot Learning, Feature Selection, Fine-Tuning, Educational Analytics, Time-Series Data, Textual Reasoning
TL;DR: A three-tier LLM framework forecasts student engagement from qualitative longitudinal data, using textual reasoning and feature selection to outperform numeric baselines.
Abstract: Forecasting nuanced shifts in student engagement from longitudinal experiential (LE) data—multi-modal, qualitative trajectories of academic experiences over time—remains challenging due to high dimensionality and missingness. We propose a natural language processing (NLP)-driven framework using large language models (LLMs) to forecast binary engagement levels across four dimensions: Lecture Engagement Disposition, Academic Self-Efficacy, Performance Self-Evaluation, and Academic Identity and Value Perception. Evaluated on 960 trajectories from 96 first-year STEM students, our three-tier approach—LLM-informed imputation to generate textual descriptors for missing-not-at-random (MNAR) patterns, zero-shot feature selection via ensemble voting, and fine-tuned LLMs—processes textual non-cognitive responses. LLMs substantially outperform numeric baselines (e.g., Random Forest, LSTM) by capturing contextual nuances in student responses. Encoder-only LLMs surpass decoder-only variants, highlighting architectural strengths for sparse, qualitative LE data. Our framework advances NLP solutions for modeling student engagement from complex LE data, excelling where traditional methods struggle.
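As a rough illustration of how the three tiers described in the abstract could be wired together, consider the Python sketch below. This is not code from the paper: `query_llm` is a hypothetical stand-in for whatever LLM API is used, the example feature names are invented, and the `bert-base-uncased` checkpoint is an assumption (the abstract only states that encoder-only LLMs are fine-tuned).

```python
# Minimal sketch of the three-tier pipeline, under the assumptions above.
from collections import Counter
from typing import Callable, Dict, List, Optional

# The four engagement dimensions named in the abstract.
DIMENSIONS = [
    "Lecture Engagement Disposition",
    "Academic Self-Efficacy",
    "Performance Self-Evaluation",
    "Academic Identity and Value Perception",
]

def impute_missing(responses: Dict[int, Optional[str]],
                   query_llm: Callable[[str], str]) -> Dict[int, str]:
    """Tier 1: replace MNAR gaps with LLM-generated textual descriptors,
    conditioning on the responses the student did provide."""
    observed = "; ".join(t for t in responses.values() if t)
    filled = {}
    for week, text in sorted(responses.items()):
        if text is None:
            filled[week] = query_llm(
                f"A student's weekly reflections so far: {observed}\n"
                f"Week {week} is missing (likely not at random). "
                "Write a short textual descriptor for this gap."
            )
        else:
            filled[week] = text
    return filled

def select_features(features: List[str],
                    voters: List[Callable[[str], str]],
                    min_votes: int = 2) -> List[str]:
    """Tier 2: zero-shot feature selection by ensemble voting across LLMs."""
    prompt = ("Which of these features best predict student engagement? "
              "Answer with a comma-separated subset: " + ", ".join(features))
    votes = Counter()
    for ask in voters:
        reply = ask(prompt)
        # Count a vote for every candidate feature the LLM named.
        votes.update(f for f in features if f in reply)
    return [f for f in features if votes[f] >= min_votes]

# Tier 3: fine-tune an encoder-only classifier on the imputed, selected text.
# Hugging Face `transformers` with "bert-base-uncased" is an assumption here.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # binary engagement per dimension

if __name__ == "__main__":
    stub = lambda prompt: "study hours, sleep quality"  # stand-in LLM
    print(select_features(["sleep quality", "study hours", "club activity"],
                          voters=[stub, stub, stub]))
```

Each student-dimension trajectory would pass through all three tiers before classification; the stub usage at the bottom only demonstrates the ensemble-voting mechanics of Tier 2.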
Supplementary Material: pdf
Submission Number: 147