Hierarchical Demonstration Order Optimization for Many-shot In-Context Learning

23 Sept 2024 (modified: 14 Dec 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: In-context learning, Demonstration Order Optimization
TL;DR: We propose an information-theoretic demonstration order quality metric and a hierarchical demonstration order optimization framework.
Abstract: In-Context Learning (ICL) is a technique in which large language models (LLMs) leverage multiple demonstrations (i.e., examples) to perform tasks. With the recent expansion of LLM context windows, many-shot ICL (generally with more than 50 demonstrations) can lead to significant performance improvements on a variety of language tasks such as text classification and question answering. Nevertheless, ICL suffers from demonstration order instability (ICL-DOI), meaning that performance varies significantly depending on the order of demonstrations. Moreover, as our thorough experimental investigation confirms, ICL-DOI persists in many-shot ICL and can even become more pronounced. Existing strategies for handling ICL-DOI, however, are not applicable to many-shot ICL, since they cannot overcome two critical challenges: (1) Most metrics for measuring demonstration order quality rely on subjective judgment and lack a theoretical foundation for precise quality characterization. Such metrics are therefore inapplicable to many-shot settings, where the quality of different orders is harder to distinguish due to the limited ability of LLMs to exploit information in long input contexts. (2) Examining all orders is computationally infeasible because of the combinatorial size of the order space in many-shot ICL (e.g., 50 demonstrations already admit 50! ≈ 3 × 10^64 orderings). To tackle the first challenge, we design an information-theoretic demonstration order evaluation metric that quantifies the usable information gain of a given demonstration order. To address the second challenge, we propose a hierarchical demonstration order optimization method, HIDO, that enables a more refined exploration of the order space and achieves high ICL performance without evaluating all possible orders. Extensive experiments on multiple LLMs and real-world datasets demonstrate that HIDO consistently and efficiently outperforms the baselines. Our code can be found at https://anonymous.4open.science/r/HIDO-B2DE/.
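For intuition only, the sketch below illustrates what a two-level (hierarchical) search over demonstration orders might look like: orders are optimized within small blocks, and then the blocks themselves are ordered greedily, so only a tiny fraction of the full permutation space is scored. The block size, greedy strategy, and `score_fn` stand-in are assumptions made for illustration; they are not the paper's actual HIDO metric or algorithm (see the linked repository for the authors' implementation).

```python
# Illustrative sketch only: a two-level (hierarchical) search over demonstration
# orders. The scoring function, block size, and greedy strategy are hypothetical
# placeholders -- NOT the paper's HIDO metric or algorithm.
import itertools
import random
from typing import Callable, List, Sequence


def hierarchical_order_search(
    demos: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],  # higher = better order (placeholder)
    block_size: int = 4,
    seed: int = 0,
) -> List[str]:
    """Greedy two-level ordering: optimize within small blocks, then order the blocks."""
    rng = random.Random(seed)
    demos = list(demos)
    rng.shuffle(demos)

    # Level 1: split into blocks and pick the best permutation inside each block
    # (block_size! candidates per block instead of len(demos)! overall).
    blocks = [demos[i:i + block_size] for i in range(0, len(demos), block_size)]
    ordered_blocks = [
        list(max(itertools.permutations(block), key=score_fn)) for block in blocks
    ]

    # Level 2: greedily append the block that most improves the running score.
    result: List[str] = []
    remaining = ordered_blocks[:]
    while remaining:
        best = max(remaining, key=lambda b: score_fn(result + b))
        result.extend(best)
        remaining.remove(best)
    return result


if __name__ == "__main__":
    # Toy usage with a dummy score; in practice score_fn would query an LLM
    # (e.g., validation log-likelihood under the candidate demonstration order).
    demos = [f"example-{i}" * (i % 4 + 1) for i in range(20)]
    dummy_score = lambda order: -sum(len(d) * pos for pos, d in enumerate(order))
    print(hierarchical_order_search(demos, dummy_score, block_size=4))
```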
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3256