Sample Efficient Demonstration Selection for In-Context Learning

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: Top-m Best Arms Identification Scheme for In-context Example Selection
Abstract: The in-context learning paradigm with LLMs has been instrumental in advancing a wide range of natural language processing tasks. The selection of few-shot examples (exemplars / demonstration samples) is essential for constructing effective prompts under context-length budget constraints. In this paper, we formulate the exemplar selection task as a top-m best arms identification problem. A key challenge in this setup is the exponentially large number of arms that must be evaluated to identify the m best arms. We propose CASE (Challenger Arm Sampling for Exemplar selection), a novel sample-efficient selective exploration strategy that maintains a shortlist of “challenger” arms, which are current candidates for the top-m arms. In each iteration, only one arm from this shortlist or the current top-m set is pulled, thereby reducing sample complexity and, consequently, the number of LLM evaluations. Furthermore, we model the scores of exemplar subsets (arms) with a parameterized linear scoring function, leading to a stochastic linear bandit setting. CASE achieves efficiency gains of up to a 7× speedup in runtime and requires 7× fewer LLM calls (an 87% reduction) without sacrificing performance compared to state-of-the-art exemplar selection methods. We release our code and data (https://github.com/kiranpurohit/CASE).
Lay Summary: We make Large Language Models (LLMs) solve complex tasks by providing them with examples that demonstrate the skills needed for those tasks. To choose the most representative samples, we devise an algorithm that efficiently explores the possible examples in a large database of such samples. The algorithm is akin to pulling arms of slot machines in a smart, optimal manner so as to obtain the maximum reward under a limited budget. In our algorithm, a particular combination of samples plays the role of an arm of a slot machine. At any point in time we maintain a shortlist of good arms and of next-best arms, and we seek to remove the weakest link from the set of good arms. The weakest link is the arm most at risk of being replaced by an arm from the set of next-best arms. The next-best arms are rotated periodically so that multiple possible next-best arms are examined. A minimal code sketch of this idea is given below.
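The sketch below is a minimal, hypothetical illustration of the challenger-shortlist idea described in the abstract and lay summary; it is not the authors' released implementation (see the linked CASE repository for that). All function and variable names (`linear_scores`, `select_arm_to_pull`, `pull_arm`, the pull-selection rule, and the least-squares update) are illustrative assumptions; only the overall structure (a linear score model over exemplar subsets, a shortlist of challengers, and one arm pull per iteration) follows the description above.

```python
# Hypothetical sketch of a top-m identification loop with a challenger shortlist
# and a linear score model over exemplar-subset features. Not the authors' code.

import numpy as np


def linear_scores(arm_features: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """Score each arm (exemplar subset) with a linear model s(x) = <theta, x>."""
    return arm_features @ theta


def select_arm_to_pull(scores, top_m_idx, challenger_idx, counts):
    """Pull either the weakest current top-m arm (most at risk of being replaced)
    or its strongest challenger, whichever has been observed less often.
    This tie-breaking rule is a simplification, not the paper's exact criterion."""
    weakest = min(top_m_idx, key=lambda i: scores[i])
    best_challenger = max(challenger_idx, key=lambda i: scores[i])
    return weakest if counts[weakest] <= counts[best_challenger] else best_challenger


def top_m_identification(arm_features, pull_arm, m=4, shortlist_size=8, budget=200):
    """Illustrative top-m best-arms identification with a challenger shortlist.

    arm_features : (n_arms, d) feature vectors of candidate exemplar subsets
    pull_arm     : callback that evaluates one arm (e.g., one LLM call) and
                   returns a noisy reward
    """
    n_arms, d = arm_features.shape
    A = np.eye(d)                       # regularized design matrix for least squares
    b = np.zeros(d)
    counts = np.zeros(n_arms, dtype=int)

    for _ in range(budget):
        theta = np.linalg.solve(A, b)   # current linear score estimate
        scores = linear_scores(arm_features, theta)
        order = np.argsort(-scores)
        top_m_idx = order[:m]                            # current best-m arms
        challenger_idx = order[m:m + shortlist_size]     # shortlist of challengers

        arm = select_arm_to_pull(scores, top_m_idx, challenger_idx, counts)
        reward = pull_arm(arm)          # exactly one LLM evaluation per iteration

        x = arm_features[arm]
        A += np.outer(x, x)             # online least-squares update of the model
        b += reward * x
        counts[arm] += 1

    theta = np.linalg.solve(A, b)
    return np.argsort(-linear_scores(arm_features, theta))[:m]
```

In this toy loop, each arm corresponds to one subset of exemplars, and only a single arm is evaluated per iteration, which is what keeps the number of LLM calls small relative to evaluating all exponentially many subsets.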
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/kiranpurohit/CASE
Primary Area: General Machine Learning->Online Learning, Active Learning and Bandits
Keywords: In-context Learning, Exemplar Selection, Stochastic Linear Bandits, Best Arm Identification
Submission Number: 9158