Keywords: Multiple-choice questions, Mathematical misconceptions, Large language models, Mathematical reasoning, Zero-shot inference, In-context learning, Fine-tuning, Multi-vector retrieval, Misconception classification
Abstract: Multiple-choice questions are widely used to assess student knowledge. Well-designed questions use distractors tied to common misconceptions. Large language models (LLMs) perform well on mathematical reasoning benchmarks, but they struggle to identify misconceptions because they do not reason the way humans do. We propose a method to improve the zero-shot inference capability of LLMs through in-context learning and fine-tuning for multi-step mathematical reasoning. Our system will classify misconceptions into 2,586 categories using multi-vector embedding retrieval. We will evaluate mean average precision when classifying the misconceptions in a Kaggle dataset of 1,868 math-related multiple-choice questions. This model will establish a foundation for using LLMs to assess incorrect answers in mathematics education.
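The retrieval and evaluation steps described in the abstract can be illustrated with a minimal sketch. This is an illustrative assumption, not the authors' implementation: it assumes a late-interaction style of multi-vector retrieval (each misconception scored by the maximum cosine similarity over the query's embedding vectors) and a MAP@k metric with one correct misconception label per question, computed on toy NumPy embeddings rather than real LLM embeddings.

```python
import numpy as np

def rank_misconceptions(query_vecs, misconception_vecs, k=25):
    """Multi-vector retrieval sketch (hypothetical): score each candidate
    misconception by its best cosine similarity against any of the query's
    vectors, then return the indices of the top-k misconceptions."""
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    m = misconception_vecs / np.linalg.norm(misconception_vecs, axis=1, keepdims=True)
    sims = q @ m.T                 # (n_query_vecs, n_misconceptions)
    scores = sims.max(axis=0)      # best-matching query vector per misconception
    return np.argsort(-scores)[:k]

def map_at_k(ranked, true_idx, k=25):
    """Average precision for a single item with one correct label:
    1/(rank of the correct label) if it appears in the top k, else 0.
    Mean average precision is the mean of this over all questions."""
    top = list(ranked[:k])
    return 1.0 / (top.index(true_idx) + 1) if true_idx in top else 0.0

# Toy usage: 10 candidate misconceptions, a query whose first vector
# matches misconception 3 exactly (synthetic data, for illustration only).
rng = np.random.default_rng(0)
misconceptions = rng.normal(size=(10, 4))
query = np.vstack([misconceptions[3], rng.normal(size=4)])
ranked = rank_misconceptions(query, misconceptions, k=5)
score = map_at_k(ranked, true_idx=3, k=5)
```

In a real system the misconception bank would hold all 2,586 category embeddings, the query would be built from the question and the chosen distractor, and MAP would be averaged over the 1,868 questions.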
Submission Number: 36