Evaluation, Analysis, and Mitigation of Shortcut Learning for Large Language Models in In-Context Learning

Anonymous

16 Feb 2024 · ACL ARR 2024 February Blind Submission · Readers: Everyone
Abstract: Recent studies have confirmed that Pre-trained Language Models (PLMs) are prone to shortcut learning, which causes a sharp drop in performance under distribution shift. However, most existing approaches focus only on shortcut learning in fine-tuned lightweight PLMs and do not extend to Large Language Models (LLMs). Moreover, how to evaluate and how to alleviate LLMs' dependence on shortcuts still requires extensive, in-depth research. Motivated by these challenges, this paper proposes a benchmark comprising two common text classification tasks to analyze and quantify the impact of shortcuts on LLMs in In-Context Learning (ICL). We then explain LLMs' shortcut learning from the perspective of information flow: LLMs tend to make one-sided inferences by exploiting the association between repeated shortcuts and labels in the context. Finally, we evaluate several prompt-based shortcut mitigation strategies that lead to more robust predictions from LLMs. Our work establishes a research pipeline for LLM shortcut learning, from evaluation to analysis to mitigation, and provides new insights into shortcut learning in LLMs.
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability
Languages Studied: English
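To make the evaluation setup described in the abstract concrete, here is a minimal sketch (not the authors' benchmark) of how shortcut reliance in ICL can be probed: a spurious trigger token is repeatedly paired with one label in the demonstrations, and the model is then tested on an anti-shortcut example where the trigger co-occurs with the opposite gold label. A large accuracy gap between clean and anti-shortcut test cases would indicate shortcut dependence. The trigger word, task, and `query_llm` placeholder are all illustrative assumptions, not details from the paper.

```python
# Minimal sketch of a shortcut-injected ICL probe (illustrative; not the paper's benchmark).

TRIGGER = "honestly"  # hypothetical spurious token tied to the "positive" label

# Demonstrations: the trigger co-occurs only with positive examples,
# creating a shortcut between the token and the label.
demonstrations = [
    (f"{TRIGGER} this film was a delight from start to finish", "positive"),
    ("the plot dragged and the acting felt wooden", "negative"),
    (f"{TRIGGER} a warm, funny, beautifully shot movie", "positive"),
    ("a tedious mess with no redeeming qualities", "negative"),
]

def build_prompt(demos, test_text):
    """Format labeled demonstrations plus one unlabeled test example as an ICL prompt."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in demos]
    lines.append(f"Review: {test_text}\nSentiment:")
    return "\n\n".join(lines)

# Anti-shortcut test: the trigger now appears in a negative review, so a model
# relying on the trigger-label association should flip to the wrong prediction.
clean_example = ("I wanted to leave halfway through", "negative")
anti_shortcut_example = (f"{TRIGGER} I wanted to leave halfway through", "negative")

def query_llm(prompt):
    """Hypothetical placeholder for an LLM call; replace with a real client."""
    raise NotImplementedError

for name, (text, gold) in [("clean", clean_example), ("anti-shortcut", anti_shortcut_example)]:
    prompt = build_prompt(demonstrations, text)
    # pred = query_llm(prompt)   # compare pred against gold across the two conditions
    print(f"--- {name} (gold: {gold}) ---\n{prompt}\n")
```

In this sketch, shortcut reliance would be measured as the drop in accuracy from the clean to the anti-shortcut condition; prompt-based mitigations of the kind the abstract mentions could then be evaluated by how much they close that gap.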