Abstract: This article presents an intervention study on the effects of the combined methods of 1) the Socratic method, 2) chain-of-thought (CoT) reasoning, 3) simplified gamification, and 4) formative feedback on university students’ math learning driven by large language models (LLMs). We call our approach Mathematics Explanations through Games by AI LLMs (MEGA). Some students struggle with math and, as a result, avoid math-related disciplines or subjects despite the importance of math across many fields, including signal processing. Oftentimes, students’ math difficulties stem from suboptimal pedagogy. We compared the MEGA method to the traditional step-by-step (CoT) method to ascertain which is better, using a within-group design in which questions were randomly assigned to the participants, who were university students. Samples $(n = 60)$ were randomly drawn from each of the two test sets of the Grade School Math 8K (GSM8K) and Mathematics Aptitude Test of Heuristics (MATH) datasets, based on an error margin of 11%, a confidence level of 90%, and a manageable number of samples for the student evaluators. These samples were used for an in-depth evaluation of two capable LLMs [Generative Pretrained Transformer 4o (GPT-4o) and Claude 3.5 Sonnet], selected from the six that were initially screened for capability. The results show that, on both datasets, students more often rated the MEGA method as better for learning. Its advantage over CoT is especially pronounced on the more difficult MATH dataset (47.5% compared to 26.67%), indicating that MEGA is better at explaining difficult math problems. We also calculated the accuracy of each of the two LLMs and show that model accuracy differs across the two methods. MEGA also appears to expose the hallucination challenge that persists in these LLMs more readily than CoT does. For transparency, we provide public access to the MEGA app, the preset instructions that we created, and the students’ annotations.
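The abstract does not state the sample-size formula behind the reported parameters; a plausible reconstruction, assuming Cochran’s formula with maximum variability $p = 0.5$ and $z \approx 1.645$ at a 90% confidence level (an assumption, since the article’s method is not given here), is consistent with the reported figures:

$n_0 = \dfrac{z^2\, p(1-p)}{e^2} = \dfrac{1.645^2 \times 0.5 \times 0.5}{0.11^2} \approx 55.9$

which rounds up to 56; drawing $n = 60$ samples per test set would satisfy this bound while remaining a manageable round number for the student evaluators.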