Keywords: RAG, LLM, Sarvam, Intelligent Tutoring System, Deep Learning in Education
TL;DR: An AI-powered tutoring system that combines LLMs with retrieval-based methods to deliver educational support aligned with the NCERT curriculum.
Abstract: This work introduces a multi-task, deep learning-based tutoring system tailored to the NCERT curriculum, leveraging Retrieval-Augmented Generation (RAG) and specialized Large Language Models (LLMs) to perform curriculum-aligned educational tasks. The system supports three primary functionalities: (1) question answering grounded in textbook content, (2) automated question generation using a fine-tuned T5 model, and (3) multilingual answering via integration with Sarvam, an LLM capable of generating responses in Indian languages.
The core architecture adopts a unified RAG pipeline, where NCERT textbooks are tokenized, semantically embedded, and indexed using dense vectors. Retrieved passages are appended as context to generative models, ensuring curriculum-grounded outputs. LangChain orchestrates the retrieval and generation flow, enhancing domain fidelity and factual accuracy across tasks.
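As a concrete illustration of this retrieve-then-generate step, the sketch below builds a dense index over textbook passages, retrieves the closest passage for a query, and supplies it as context to a seq2seq generator. The model names, example passages, and prompt format are illustrative assumptions rather than the system's actual configuration (the paper's pipeline is orchestrated with LangChain).

```python
# Minimal sketch of the dense-retrieval RAG step described above.
# Embedding model, generator, and prompt format are assumptions, not the
# authors' exact stack.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# 1) Embed NCERT textbook passages and build a dense index.
passages = [
    "Photosynthesis is the process by which green plants synthesise food ...",
    "The French Revolution began in 1789 ...",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")   # assumed embedding model
vectors = embedder.encode(passages, normalize_embeddings=True)
index = faiss.IndexFlatIP(vectors.shape[1])          # cosine similarity via inner product
index.add(np.asarray(vectors, dtype="float32"))

# 2) Retrieve the top passage for a student query.
query = "What is photosynthesis?"
q_vec = embedder.encode([query], normalize_embeddings=True)
_, hits = index.search(np.asarray(q_vec, dtype="float32"), 1)
context = "\n".join(passages[i] for i in hits[0])

# 3) Append the retrieved context to the prompt of a generative model.
generator = pipeline("text2text-generation", model="google/flan-t5-base")
prompt = f"Answer using only the context.\nContext: {context}\nQuestion: {query}"
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```

Because the embeddings are normalized, inner-product search is equivalent to cosine similarity, and conditioning the generator on the retrieved passage is what keeps its answers grounded in the curriculum text.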
To improve the accuracy and curriculum alignment of automated question generation, we systematically evaluate several fine-tuning strategies for the T5 model. Specifically, we compare (1) the base T5 model, (2) T5 fine-tuned on NCERT questions and solutions, (3) T5 fine-tuned on GPT-4-generated self-instruct data, and (4) a combined method employing both parameter-efficient fine-tuning (PEFT) and self-instruct tuning. Among these, the combination of PEFT with self-instruct tuning emerges as the most effective approach, consistently achieving higher accuracy, better lexical alignment, and greater question diversity.
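The sketch below shows a minimal PEFT (LoRA) setup for the question-generation model; the adapter rank, target modules, learning rate, and the single self-instruct-style training pair are assumptions chosen for illustration, not the training configuration reported in the paper.

```python
# Illustrative LoRA-based PEFT setup for T5 question generation; all
# hyperparameters and the example training pair are assumptions.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

# Wrap T5 with low-rank adapters so only a small fraction of parameters is trained.
lora = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, r=16, lora_alpha=32,
                  lora_dropout=0.05, target_modules=["q", "v"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# One self-instruct-style pair: textbook passage in, curriculum-style question out.
source = "generate question: Photosynthesis is the process by which green plants ..."
target = "What is photosynthesis and where does it occur in the plant?"
inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

# A single optimisation step; in practice this loops over the full dataset.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
```

Only the injected low-rank adapter weights receive gradients, which is what makes combining PEFT with self-instruct data inexpensive enough to run over the full NCERT question set.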
Submission Number: 16