Keywords: In-Context Learning, Label Unreliability, Label Uncertainty, Label Corruption
Abstract: In-context learning (ICL) enables large language models (LLMs) to solve downstream tasks with a small set of labeled demonstrations. A central problem in ICL is how to select these demonstrations, and most methods approach it empirically through similarity or diversity. However, to design effective selection strategies, it is important to first understand what ICL actually learns, particularly how it depends on the relationship between inputs and labels. To study this, we unify prior studies under the Label Unreliability framework, which captures how unreliable labels provide imperfect supervision. Viewing ICL as implicitly performing transductive label propagation, we establish a bridge between selection strategies and label unreliability, which reveals a key insight: similarity-based selection is highly sensitive to label unreliability, whereas diversity-based selection offers robustness. Effective selection therefore requires balancing similarity, to capture meaningful sample representations, with diversity, to mitigate the effects of imperfect supervision.
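The abstract contrasts similarity-based and diversity-based demonstration selection; as a minimal sketch (not code from the paper), the two strategies can be illustrated over hypothetical sentence embeddings. The embedding pool, the query, and the farthest-point-sampling variant of diversity selection below are all illustrative assumptions.

```python
import numpy as np

def similarity_select(pool: np.ndarray, query: np.ndarray, k: int) -> list[int]:
    """Pick the k pool items most cosine-similar to the query embedding."""
    pool_n = pool / np.linalg.norm(pool, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = pool_n @ query_n
    return list(np.argsort(-scores)[:k])

def diversity_select(pool: np.ndarray, k: int) -> list[int]:
    """Greedy farthest-point sampling: each pick maximizes its minimum
    distance to items already chosen, spreading demonstrations apart
    (one common way to instantiate diversity-based selection)."""
    chosen = [0]  # arbitrary seed point
    dists = np.linalg.norm(pool - pool[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))
        chosen.append(nxt)
        # Keep, for every pool item, its distance to the nearest chosen item.
        dists = np.minimum(dists, np.linalg.norm(pool - pool[nxt], axis=1))
    return chosen

# Toy usage: 100 hypothetical 8-dim embeddings and one query.
rng = np.random.default_rng(0)
pool = rng.normal(size=(100, 8))
query = rng.normal(size=8)
print(similarity_select(pool, query, k=4))  # clusters near the query
print(diversity_select(pool, k=4))          # spreads across the pool
```

Under the abstract's framing, the first strategy concentrates demonstrations near the query, so a few unreliable labels in that neighborhood dominate the propagated signal, while the second spreads picks out and dilutes any single corrupted label.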
Paper Type: Long
Research Area: Language Models
Research Area Keywords: Language Modeling
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 7321