Dynamic Cross-Layer Prefix Alignment for Resolving Label Preference Discrepancies in LLMs Fine-Tuning

ICLR 2025 Conference Submission 1918 Authors (anonymous)

19 Sept 2024 (modified: 13 Oct 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: Label preference discrepancies, Cross-layer prefix sharing, Large language model fine-tuning
Abstract: Fine-tuning large language models (LLMs) to adapt them to specialized downstream tasks is common practice, yet existing methods overlook a critical issue: label preference discrepancies among annotators. Such labeling inconsistencies can significantly impair a model's robustness and generalization. In this work, we propose Dynamic Cross-Layer Preference Correction (DCPC), a novel self-supervised learning framework designed to mitigate these inconsistencies. DCPC combines a preference-sensitive similarity mechanism, cross-layer prefix alignment, and a Preference Correction Module (PCM) to dynamically adjust embeddings across transformer layers. By leveraging self-supervision, DCPC aligns semantic representations and keeps label predictions consistent even in the presence of preference shifts. We evaluate DCPC across multiple tasks using prominent base models and introduce modified datasets that simulate real-world preference shifts. Our results show that DCPC consistently outperforms state-of-the-art parameter-efficient fine-tuning (PEFT) methods in handling label preference discrepancies.
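To make the abstract's three components concrete, below is a minimal, illustrative PyTorch sketch of one plausible reading: a similarity measured in a learned preference subspace, a single prefix re-aligned per layer through a learned gate, and a residual correction scaled by annotator disagreement. All class and parameter names (PreferenceSensitiveSimilarity, CrossLayerPrefixAlignment, PreferenceCorrectionModule, pref_proj, gates) are hypothetical; this is a sketch under stated assumptions, not the authors' implementation.

```python
# Minimal sketch (assumption: PyTorch). All names below are hypothetical
# illustrations of the abstract's components, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PreferenceSensitiveSimilarity(nn.Module):
    """Cosine similarity computed in a learned 'preference' subspace."""

    def __init__(self, d_model: int, d_pref: int = 64):
        super().__init__()
        self.pref_proj = nn.Linear(d_model, d_pref, bias=False)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # Project both embeddings, then compare; returns [batch] similarities.
        return F.cosine_similarity(self.pref_proj(a), self.pref_proj(b), dim=-1)


class CrossLayerPrefixAlignment(nn.Module):
    """One shared prefix, re-aligned per transformer layer via a learned gate."""

    def __init__(self, n_layers: int, prefix_len: int, d_model: int):
        super().__init__()
        self.shared_prefix = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
        self.gates = nn.Parameter(torch.zeros(n_layers, prefix_len, 1))
        self.align = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_layers)
        )

    def prefix_for_layer(self, layer: int) -> torch.Tensor:
        # The gate interpolates between the shared prefix and its
        # layer-specific aligned version.
        g = torch.sigmoid(self.gates[layer])
        return g * self.align[layer](self.shared_prefix) + (1 - g) * self.shared_prefix


class PreferenceCorrectionModule(nn.Module):
    """Residual correction of hidden states, scaled by preference disagreement."""

    def __init__(self, d_model: int):
        super().__init__()
        self.correct = nn.Sequential(
            nn.Linear(d_model, d_model), nn.GELU(), nn.Linear(d_model, d_model)
        )

    def forward(self, h: torch.Tensor, disagreement: torch.Tensor) -> torch.Tensor:
        # disagreement in [0, 1]: larger preference shifts get larger corrections.
        return h + disagreement.unsqueeze(-1) * self.correct(h)


# Toy usage: reconcile two annotators' embeddings of the same example.
d_model = 32
sim = PreferenceSensitiveSimilarity(d_model)
pcm = PreferenceCorrectionModule(d_model)
cpa = CrossLayerPrefixAlignment(n_layers=2, prefix_len=4, d_model=d_model)

h_a, h_b = torch.randn(4, d_model), torch.randn(4, d_model)
disagreement = 1.0 - sim(h_a, h_b).clamp(min=0.0)  # high when views diverge
h_a_corrected = pcm(h_a, disagreement)

print(h_a_corrected.shape)            # torch.Size([4, 32])
print(cpa.prefix_for_layer(0).shape)  # torch.Size([4, 32])
```

The gated interpolation is one natural way to share a prefix across layers while still letting each layer deviate; whether DCPC uses a gate, a projection, or another alignment operator is not specified in the abstract.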
Primary Area: foundation or frontier models, including LLMs
Submission Number: 1918