Keywords: In-Context Learning, Large Language Models, Exemplar Selection, Stochastic Linear Bandits, Challenger Arms
TL;DR: An efficient gap-index-based formulation for identifying the top-m arms (exemplar subsets) for In-Context Learning.
Abstract: The in-context learning paradigm with LLMs has been instrumental in advancing applications that require complex reasoning over natural language. An optimal selection of few-shot examples (exemplars) is essential for constructing effective prompts under a limited budget.
In this paper, we frame the problem of exemplar selection for In-Context Reasoning (ICR) as a top-m best arms identification problem. A key challenge in this setting is the exponentially large number of arms that must be evaluated to identify the top-m arms. We propose CASE (Challenger Arm Sampling for Exemplar selection), a novel selective exploration strategy that maintains a shortlist of "challenger" arms, which are current candidates for the top-m arms. In each iteration, only the arms from this shortlist and the current top-m set are pulled, thereby reducing sample complexity and, consequently, the number of LLM evaluations. Furthermore, we model the scores of exemplar subsets (arms) using a parameterized linear scoring function, leading to a stochastic linear bandits setting. In this setting, CASE identifies the top-m arms with significantly fewer evaluations than existing state-of-the-art methods. CASE works effectively with black-box LLMs and selects a static set of few-shot examples, resulting in an extremely efficient scheme for in-context reasoning. The exemplars selected with CASE yield performance gains of up to 15.19% over state-of-the-art exemplar selection methods. We release our code and data (https://anonymous.4open.science/r/CASE_exemplar_bandits-7403).
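To make the abstract's description concrete, below is a minimal sketch of challenger-arm sampling for top-m identification in a stochastic linear bandit. It is an illustration under simplifying assumptions, not the paper's exact algorithm: the names (`reward_fn`, `n_challengers`), the fixed round budget, and the ridge-regression estimate of the linear scoring function are all hypothetical choices made for this example.

```python
# Illustrative sketch: maintain a linear estimate of arm scores, keep a
# shortlist of "challenger" arms just below the current top-m, and pull
# only arms from the top-m set or the challenger shortlist.
import numpy as np

def select_top_m(arm_features, reward_fn, m, n_challengers=5,
                 n_rounds=500, reg=1.0, seed=0):
    """Return indices of the estimated top-m arms.

    arm_features : (K, d) array, one feature vector per exemplar subset (arm).
    reward_fn    : callable(arm_index) -> noisy scalar score
                   (e.g., an LLM evaluation of a prompt built from that subset).
    """
    rng = np.random.default_rng(seed)
    K, d = arm_features.shape
    A = reg * np.eye(d)          # ridge-regularized design matrix
    b = np.zeros(d)              # accumulated feature-weighted rewards

    for _ in range(n_rounds):
        theta_hat = np.linalg.solve(A, b)            # current linear estimate
        scores = arm_features @ theta_hat            # estimated arm scores
        order = np.argsort(-scores)
        top_m = order[:m]                            # current top-m candidates
        challengers = order[m:m + n_challengers]     # closest competitors

        # Selective exploration: pull only arms in the top-m set or the
        # challenger shortlist, rather than sweeping all K arms.
        active = np.concatenate([top_m, challengers])
        i = int(rng.choice(active))
        x = arm_features[i]
        r = reward_fn(i)

        A += np.outer(x, x)
        b += r * x

    theta_hat = np.linalg.solve(A, b)
    return np.argsort(-(arm_features @ theta_hat))[:m]
```

A fixed round budget is used here for brevity; a gap-index stopping rule, as suggested by the TL;DR, would instead terminate once the estimated gap between the top-m set and the best challenger exceeds a confidence width.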
Supplementary Material: zip
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10006