Abstract: Previous research often uses symbolic distillation to transfer the reasoning abilities of large teacher models to smaller student models. However, in multi-choice machine reading comprehension (MMRC), learning only from the rationales the teacher model generates for correct options overlooks the educational significance of understanding why the incorrect options are wrong. In education, metacognition requires individuals to actively identify errors while reading in order to deepen their understanding. To this end, we propose a novel framework for metacognitive symbolic distillation. First, we prompt the teacher large language model (LLM) to generate a rationale for each option in the MMRC dataset. The student model is then fine-tuned on the MMRC data equipped with these rationales. Experiments on two MMRC datasets demonstrate that our approach effectively enhances the performance of the small model compared with standard fine-tuned models and symbolically distilled models. Moreover, when the student model is large enough, upgrading the teacher model leads to further improvements. We will make our code and data publicly available.
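To make the two-step pipeline in the abstract concrete, here is a minimal sketch of the data-construction step: the teacher is prompted for a rationale for every option, correct and incorrect, and the rationales are attached to each MMRC example for student fine-tuning. The prompt wording, field names, and the `teacher_generate` callable are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of metacognitive distillation data construction (illustrative only).
from typing import Callable, Dict, List

# Assumed prompt template; the paper's actual prompt may differ.
RATIONALE_PROMPT = (
    "Passage: {passage}\n"
    "Question: {question}\n"
    "Option: {option}\n"
    "Explain step by step why this option is correct or incorrect."
)

def build_distillation_examples(
    mmrc_examples: List[Dict],
    teacher_generate: Callable[[str], str],
) -> List[Dict]:
    """Attach a teacher-generated rationale to every option of every example."""
    distilled = []
    for ex in mmrc_examples:
        rationales = []
        for option in ex["options"]:
            prompt = RATIONALE_PROMPT.format(
                passage=ex["passage"], question=ex["question"], option=option
            )
            rationales.append(teacher_generate(prompt))
        # The student is later fine-tuned on examples augmented with these rationales.
        distilled.append({**ex, "rationales": rationales})
    return distilled

if __name__ == "__main__":
    # Stub teacher for demonstration; in practice this would call a large LLM.
    sample = [{
        "passage": "The trial ended after new evidence emerged.",
        "question": "Why did the trial end?",
        "options": ["New evidence emerged.", "The judge retired."],
        "answer": 0,
    }]
    stub_teacher = lambda prompt: "Rationale: ..."
    print(build_distillation_examples(sample, stub_teacher))
```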
Paper Type: short
Research Area: Question Answering
Languages Studied: English
Preprint Status: We are considering releasing a non-anonymous preprint in the next two months (i.e., during the reviewing process).
A1: yes
A2: yes
A3: yes
B: yes
C: yes
D: no
E: yes