Keywords: In-context learning, Distribution shift, Robustness
Abstract: In-context learning (ICL) is a popular approach to supplying Large Language Models (LLMs) with task context without fine-tuning. ICL feeds the model the test input together with examples selected from a candidate dataset, which illustrate the target task, and then obtains the answer. In real-world applications, noisy samples easily end up in datasets, so the candidate set will inevitably contain some noise caused by human or measurement errors. Because the effectiveness of ICL depends heavily on the quality of the selected examples, noise in the candidate set can severely mislead the answer to a query and degrade ICL performance. Nevertheless, the problem of ICL under noise has been largely overlooked. To tackle this challenge, we propose Context Distribution Shift (ConDS), which iteratively revises the distribution of the candidate dataset so that the retrieved ICL samples are emphasized, improving the robustness of ICL. Specifically, we first identify informative samples based on the retriever's ranking score and feedback from the LLM, and then augment the identified samples. A subsampling strategy is also adopted to emphasize the informative samples and reduce the number of noisy samples. As a result, ICL becomes more reliable: the catastrophic impact of noisy samples is confined to a small percentage of test queries rather than affecting almost all of them. ConDS can be easily combined with existing off-the-shelf and fine-tuned retrievers. We also provide an analysis of the relationship between ConDS and retrievers. Experimental results show that, under the influence of noise, ConDS outperforms baselines on various tasks by a large margin of 8.12%.
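The identify, augment, and subsample steps described in the abstract can be illustrated with a minimal sketch of one revision round. This is not the authors' implementation: the function `revise_candidates`, its parameters (`top_k`, `copies`, `keep_ratio`), the callables `retriever_score` and `llm_feedback`, and the use of plain duplication as a stand-in for augmentation are all hypothetical choices made for illustration, since the abstract does not specify these details.

```python
import random
from typing import Callable, List


def revise_candidates(
    candidates: List[str],
    retriever_score: Callable[[str], float],  # hypothetical: higher means ranked earlier
    llm_feedback: Callable[[str], bool],      # hypothetical: True if the LLM found the sample helpful
    top_k: int = 3,
    copies: int = 2,
    keep_ratio: float = 0.5,
    seed: int = 0,
) -> List[str]:
    """One illustrative round of revising the candidate set."""
    rng = random.Random(seed)

    # Step 1 (identify): rank by retriever score, then keep the top-k
    # samples that also receive positive feedback from the LLM.
    order = sorted(
        range(len(candidates)),
        key=lambda i: retriever_score(candidates[i]),
        reverse=True,
    )
    informative = [i for i in order[:top_k] if llm_feedback(candidates[i])]
    informative_set = set(informative)

    # Step 2 (augment): duplication is an assumed stand-in for whatever
    # augmentation the paper actually uses.
    revised = [candidates[i] for i in informative for _ in range(copies)]

    # Step 3 (subsample): randomly keep only a fraction of the remaining
    # samples, which shrinks the share of potentially noisy ones.
    rest = [i for i in range(len(candidates)) if i not in informative_set]
    kept = rng.sample(rest, int(len(rest) * keep_ratio))
    revised += [candidates[i] for i in sorted(kept)]
    return revised


# Toy data: the scores and the "noisy_" naming convention are made up.
scores = {"good_a": 0.9, "noisy_c": 0.8, "good_b": 0.7,
          "other_d": 0.3, "other_e": 0.2, "other_f": 0.1}
revised = revise_candidates(
    list(scores),
    retriever_score=scores.__getitem__,
    llm_feedback=lambda s: not s.startswith("noisy_"),
)
```

In this toy run the highly ranked but noisy sample is not treated as informative because the assumed LLM feedback rejects it, so it is only kept if the random subsampling happens to select it. Repeating such a round would correspond to the iterative revision the abstract mentions.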
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7395