Dynamic Alignment of Representations for Enhanced Chain-of-Thought Reasoning in Large Language Models

Chenxi Huang; Liang Xie; Chen Shen; Shaotian Yan; Sinan Fan; Zhihong Gu; Binbin Lin; Deng Cai; Jieping Ye

26 Sept 2024 (modified: 18 Nov 2024). ICLR 2025 Conference Withdrawn Submission. License: CC BY 4.0.

Keywords: Large Language Models; LLM reasoning; LLM COT; PEFT

Abstract: Representations encode rich semantic information, implying that editing them could serve as an effective tool (e.g., DAS, ReFT) for parameter-efficient finetuning (PEFT). However, existing approaches typically focus on general categories of representations or on selecting an appropriate number of continuous representations for each dataset, which limits their adaptability and performance. In contrast, our method dynamically selects the representations requiring intervention at the instance level. We refer to these as misaligned representations; they are characterized by a lack of semantic information or of appropriate attention. Identifying misaligned representations is challenging, as they serve different roles in varying contexts. It is evident that crucial representations, namely those that primarily receive information flow from themselves or that significantly influence other representations, are likely to encompass the misaligned ones. Consequently, we simplify the task by pivoting our focus to crucial representations and aiming to locate them accurately. We adaptively update crucial representations amidst uncertainty, freezing the base model while learning an update direction for each layer. Combining the identification and updating of representations, we present a PEFT method termed Dynamic Alignment of Representations (DAR). We validate the effectiveness of our method on eight diverse datasets across two scenarios, arithmetic and commonsense, and three base models: LLaMA-2-7B, LLaMA-2-13B, and LLaMA-3-8B. Notably, our method yields improvements of 17.47% and 3.11% over LLaMA-2-7B and ReFT on the GSM8K dataset, respectively. Additionally, it requires 51 times fewer parameters than LoRA, demonstrating significant parameter efficiency. Furthermore, our method can be easily extended to few-shot learning.
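
To make the two steps in the abstract concrete (locating crucial representations, then updating only those), here is a minimal sketch. It is not the authors' implementation. The function names `crucial_token_indices` and `apply_update`, the threshold `self_thresh`, the `top_k` cutoff, and the use of a single attention matrix are all assumptions made for illustration. In the method as described, the update direction is learned per layer while the base model stays frozen; here it is simply passed in as a fixed vector.

```python
from typing import List

Matrix = List[List[float]]


def crucial_token_indices(attn: Matrix, self_thresh: float = 0.5, top_k: int = 1) -> List[int]:
    """Return sorted indices of tokens flagged as crucial (illustrative criteria).

    attn[i][j] is the attention weight that token i places on token j.
    """
    n = len(attn)
    # Assumed criterion 1: the token mostly receives information from itself.
    self_focused = {i for i in range(n) if attn[i][i] >= self_thresh}
    # Assumed criterion 2: the token strongly influences others, measured here
    # as the total attention it receives from all other tokens.
    influence = [sum(attn[i][j] for i in range(n) if i != j) for j in range(n)]
    ranked = sorted(range(n), key=lambda j: influence[j], reverse=True)
    influential = set(ranked[:top_k])
    return sorted(self_focused | influential)


def apply_update(hidden: Matrix, indices: List[int], direction: List[float],
                 scale: float = 1.0) -> Matrix:
    """Add one shared direction to the selected token representations only."""
    chosen = set(indices)
    return [
        [h + scale * d for h, d in zip(row, direction)] if t in chosen else list(row)
        for t, row in enumerate(hidden)
    ]


# Toy example: 3 tokens, 2-dimensional hidden states, causal attention.
attn = [
    [1.0, 0.0, 0.0],
    [0.6, 0.4, 0.0],
    [0.2, 0.1, 0.7],
]
hidden = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
direction = [0.5, -0.5]  # stands in for a learned per-layer direction

selected = crucial_token_indices(attn)
updated = apply_update(hidden, selected, direction)
```

In this toy input, token 0 is selected under both criteria and token 2 because it attends mostly to itself; token 1 is left untouched. Selecting per instance in this way, rather than fixing a set of positions per dataset, is the contrast the abstract draws with earlier representation-editing approaches.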

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 6740