Keywords: instruction fine-tuning, large language models
Abstract: Pretrained Large Language Models (LLMs) require post-training methods such as supervised fine-tuning (SFT) on instruction-response pairs to enable instruction following.
However, this process can degrade existing capabilities learned during pretraining.
In this paper, we investigate the loss of context awareness after SFT, defined as the capability to extract and understand information from the user-provided context and respond accordingly.
We are the first to identify and show that the loss of context awareness appears in instruction-finetuned LLMs when the chat template is applied to input prompts.
We find that this performance decline is partly caused by a bias embedded in the chat template that leads the model to focus less on the user-provided context.
Based on these observations, we propose two methods to mitigate the loss of context awareness in instruct models: post hoc attention steering on user prompts and conditional instruction fine-tuning with a context-dependency indicator.
Experiments on four context-dependent downstream tasks and three pretrained LLMs of different sizes show that our methods effectively mitigate the loss of context awareness without compromising general instruction-following ability.
Our findings also highlight the need to carefully benchmark context awareness after instruction fine-tuning.
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9063