Mitigating Copy Bias in In-Context Learning through Neuron Pruning

ACL ARR 2025 July Submission525 Authors

28 Jul 2025 (modified: 30 Aug 2025), ACL ARR 2025 July Submission, CC BY 4.0

Abstract: Large language models (LLMs) have demonstrated impressive few-shot in-context learning (ICL) abilities. However, we show that they are sometimes prone to a "copying bias", in which they copy answers from the provided examples instead of learning the underlying patterns. In this work, we propose a novel and simple method to mitigate this copying bias. First, we create a synthetic task and use the Integrated Gradients method to identify neurons that prioritize copying over generalization. We then demonstrate that pruning these neurons consistently improves performance across a diverse set of ICL tasks, including both single-token and multi-token scenarios, while maintaining or even improving the model's general capabilities. We also show that our method is applicable across various LLM architectures, including Transformers and State-Space Models, without requiring modifications. In our analysis, we adopt a task-recognition perspective on ICL and examine the task vectors (Hendel et al., 2023) induced by the model. We find that pruning enhances the quality of these vectors, suggesting that the pruned neurons previously hindered effective task recognition.
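
The attribute-then-prune idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function `copy_score`, the weights `w`, the activations `h`, the zero baseline, and the choice of pruning the top-k neurons are all hypothetical stand-ins, chosen only to show how Integrated Gradients assigns a score to each neuron and how the highest-scoring neurons might then be zeroed out.

```python
import math

def copy_score(h, w):
    """Hypothetical stand-in for a model's 'copying' logit,
    written as a function of hidden activations h."""
    return sum(wi * math.tanh(hi) for wi, hi in zip(w, h))

def grad_copy_score(h, w):
    """Analytic gradient of copy_score with respect to each activation."""
    return [wi * (1.0 - math.tanh(hi) ** 2) for wi, hi in zip(w, h)]

def integrated_gradients(h, w, baseline=None, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients
    along the straight path from the baseline to h."""
    if baseline is None:
        baseline = [0.0] * len(h)  # assumed all-zero baseline
    total = [0.0] * len(h)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (x - b) for x, b in zip(h, baseline)]
        grads = grad_copy_score(point, w)
        total = [t + g for t, g in zip(total, grads)]
    # Scale the averaged gradient by the input-minus-baseline difference.
    return [(x - b) * t / steps for x, b, t in zip(h, baseline, total)]

def prune_top_k(h, attributions, k):
    """Zero out the k neurons with the largest attribution scores."""
    ranked = sorted(range(len(h)), key=lambda i: attributions[i], reverse=True)
    pruned = set(ranked[:k])
    return [0.0 if i in pruned else x for i, x in enumerate(h)], sorted(pruned)

# Toy activations and weights (illustrative values only).
h = [2.0, -0.5, 0.1, 1.5]
w = [1.0, 0.3, -0.2, 0.8]

attr = integrated_gradients(h, w)
h_pruned, pruned_idx = prune_top_k(h, attr, k=2)
```

In this toy setting the attributions sum (up to numerical error) to the difference between the score at `h` and at the baseline, which is the completeness property of Integrated Gradients, and zeroing the two highest-attribution neurons lowers the toy copying score. How the paper selects its synthetic task, baseline, and number of pruned neurons is not specified in the abstract.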

Paper Type: Long

Research Area: Interpretability and Analysis of Models for NLP

Research Area Keywords: in-context learning, interpretability, memorization

Contribution Types: Model analysis & interpretability

Languages Studied: English

Reassignment Request Area Chair: Yes, I want a different area chair for our submission

Reassignment Request Reviewers: Yes, I want a different set of reviewers

A1 Limitations Section: This paper has a limitations section.

A2 Potential Risks: N/A

B Use Or Create Scientific Artifacts: Yes

B1 Cite Creators Of Artifacts: Yes

B1 Elaboration: References

B2 Discuss The License For Artifacts: N/A

B3 Artifact Use Consistent With Intended Use: N/A

B4 Data Contains Personally Identifying Info Or Offensive Content: N/A

B5 Documentation Of Artifacts: N/A

B6 Statistics For Data: Yes

B6 Elaboration: 3 + 4 (Method + Experiments)

C Computational Experiments: Yes

C1 Model Size And Budget: Yes

C1 Elaboration: Appendix

C2 Experimental Setup And Hyperparameters: Yes

C2 Elaboration: Appendix

C3 Descriptive Statistics: Yes

C3 Elaboration: Experiments

C4 Parameters For Packages: N/A

D Human Subjects Including Annotators: No

D1 Instructions Given To Participants: N/A

D2 Recruitment And Payment: N/A

D3 Data Consent: N/A

D4 Ethics Review Board Approval: N/A

D5 Characteristics Of Annotators: N/A

E AI Assistants In Research Or Writing: No

E1 Information About Use Of AI Assistants: N/A

Author Submission Checklist: yes

Submission Number: 525
