Keywords: Small language models, large language models, in-context learning, natural language processing, test-time adaptation
TL;DR: We introduce AdaptMI, an adaptive approach to selecting skill-based in-context math instructions for small language models that boosts performance by up to 6%.
Abstract: In-context learning (ICL) enhances language model performance by providing relevant contextual information.
Recent works (Didolkar et al., 2024a;b) show that ICL performance can be improved by leveraging a frontier large
language model’s (LLM) ability to predict the skills required to solve a problem, popularly referred to as the LLM’s
metacognition, and using the recommended skills to construct the necessary in-context examples. While this improves
performance in larger models, smaller language models (SLMs) see minimal benefit, revealing a performance gap.
We show that skill-based prompting can hurt SLM performance on easy questions by introducing unnecessary
information, akin to cognitive overload. To mitigate this, we introduce AdaptMI, an Adaptive strategy for
selecting skill-based Math Instructions. Guided by cognitive load theory, AdaptMI introduces skill-based
examples only when the model performs poorly. We further propose AdaptMI+, which provides targeted
examples for specific missing skills. In 5-shot evaluations on popular math benchmarks and five SLMs (1B–7B;
Qwen, Llama), AdaptMI+ improves accuracy by up to 6% compared to naive skill-based methods.
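A minimal sketch of the adaptive selection logic described above, under stated assumptions: the skill-to-example bank, the weak-question flag, and all names (`SKILL_EXAMPLE_BANK`, `build_prompt`) are hypothetical illustrations, not the authors' implementation.

```python
# Hypothetical bank mapping each math skill to a worked in-context example.
# Contents are illustrative placeholders.
SKILL_EXAMPLE_BANK: dict[str, str] = {
    "fraction_arithmetic": "Q: 1/2 + 1/3 = ?\nA: 5/6",
    "linear_equations": "Q: Solve 2x + 3 = 7.\nA: x = 2",
}

def build_prompt(question: str,
                 is_weak: bool,
                 missing_skills: list[str],
                 default_examples: list[str]) -> str:
    """AdaptMI-style selection (sketch): use skill-targeted examples only
    when the model is judged weak on this question; otherwise keep the
    default few-shot examples, avoiding extra load on easy questions."""
    if is_weak and missing_skills:
        # AdaptMI+-style: pick examples only for the specific missing skills.
        shots = [SKILL_EXAMPLE_BANK[s] for s in missing_skills
                 if s in SKILL_EXAMPLE_BANK]
    else:
        shots = default_examples
    return "\n\n".join(shots + [f"Q: {question}\nA:"])

# Usage: a question flagged as weak on linear equations gets a targeted shot.
prompt = build_prompt("Solve 3x - 1 = 8.", is_weak=True,
                      missing_skills=["linear_equations"],
                      default_examples=["Q: 2 + 2 = ?\nA: 4"])
```

How the weak/strong judgment and the missing-skill labels are produced (the detection stage) is the substance of the paper and is not modeled here.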
Code: zip
Submission Number: 50