In-context Curriculum for Mathematical Reasoning in Small Language Models

24 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: small language models, mathematical reasoning, in-context learning, specialization, Chain-of-thought prompting, deep learning, transformers, large language models
Abstract: Specializing Small Language Models (SLMs) in mathematical reasoning improves the scaling of model performance and reduces the cost of inference. Leveraging the model's context is key to specialization and parameter-free adaptation in the In-context Learning (ICL) paradigm. For Large Language Models (LLMs), prompts whose chain-of-thought (COT) demonstrations contain more reasoning steps are known to yield higher test accuracy on mathematical reasoning datasets such as GSM8K. Although SLMs have limited capability for multi-step COT reasoning, prior work on specializing SLMs fills the model's context with multi-step COT demonstrations. We propose an alternative, the In-context Curriculum Random (ICCR) prompt, which varies the complexity of demonstrations, ranging from a single COT reasoning step to more complex multi-step COT demonstrations. ICCR achieves 16.15% inference accuracy on the GSM8K dataset, surpassing the 14.33% accuracy of the GPT-3.5-distilled COT baseline for SLM specialization. Unlike that baseline, ICCR draws on out-of-distribution datasets, namely ASDiv, SVAMP, and MathQA, to supply the simpler COT reasoning demonstrations. Within ICL, demonstrations based on basic arithmetic calculations written in natural language are shown to outperform both the baseline and ICCR prompts on the Google FLAN-T5 XL and XXL models. We conclude that at model scales from 250M to 11B parameters, simpler COT-based reasoning prompts result in higher performance.
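To make the prompt-construction idea concrete, the sketch below is one plausible reading of an ICCR-style prompt, not the authors' implementation: demonstrations are ordered from a single COT step up to multi-step COT examples before the test question is appended. The function name `build_iccr_prompt` and the toy demonstrations are illustrative assumptions, not drawn from ASDiv, SVAMP, MathQA, or GSM8K.

```python
# Minimal sketch (assumed, not the submission's code) of assembling a few-shot
# prompt whose demonstrations increase in COT complexity.

def build_iccr_prompt(demonstrations, question):
    """Order demonstrations by number of reasoning steps (simple -> complex)."""
    ordered = sorted(demonstrations, key=lambda d: len(d["steps"]))
    blocks = []
    for demo in ordered:
        reasoning = " ".join(demo["steps"])
        blocks.append(f"Q: {demo['question']}\nA: {reasoning} The answer is {demo['answer']}.")
    # Append the test question with an empty answer slot for the model to complete.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Illustrative placeholder demonstrations: one single-step and one two-step COT example.
demos = [
    {
        "question": "Tom has 3 apples and buys 2 more. How many apples does he have?",
        "steps": ["3 + 2 = 5."],
        "answer": "5",
    },
    {
        "question": "A shop sells 4 boxes of 6 pens and 8 loose pens. How many pens in total?",
        "steps": ["4 * 6 = 24 pens in boxes.", "24 + 8 = 32 pens in total."],
        "answer": "32",
    },
]

print(build_iccr_prompt(demos, "Sara reads 12 pages a day for 5 days. How many pages does she read?"))
```

The resulting string would be passed as the ICL prompt to a model such as FLAN-T5; under this reading, the "curriculum" is simply the simple-to-complex ordering of the in-context demonstrations.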
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8651