Keywords: large language models, internal structure, parameter-efficient fine-tuning
Abstract: In recent years, the performance of large language models (LLMs) on reasoning tasks has been remarkable, even surpassing human capabilities on various benchmarks. However, there remains a lack of clear understanding in the academic community regarding how the structure and internal parameters of LLMs progressively solve complex reasoning problems.
In this study, we investigate the inference process of LLMs on cross-lingual materials and propose the hypothesis that LLM layers exhibit a structured division of labor across conceptualization, reasoning, and textualization. Conceptualization layers are crucial for transforming natural-language inputs into abstract representations within the LLM, while reasoning layers play a key role in reasoning over these abstract concepts. Finally, textualization layers convert the abstract representations back into natural language. Based on this hypothesis, we propose a novel approach, LIFT, which achieves efficient and effective fine-tuning by selectively fine-tuning only those layers most relevant to a given task's functionality. We then conduct extensive experiments showing that LIFT not only accelerates the training process but also significantly improves model performance.
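The layer-selective fine-tuning idea in the abstract can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the layer indices chosen as "task-relevant" and the helper name `freeze_except` are assumptions for demonstration, using a toy stack of linear blocks in place of transformer layers.

```python
import torch
from torch import nn

def freeze_except(model_layers: nn.ModuleList, keep: set) -> None:
    """Freeze every layer except those whose index is in `keep`.
    (Hypothetical helper; the actual LIFT selection criterion is
    described in the paper, not reproduced here.)"""
    for i, layer in enumerate(model_layers):
        trainable = i in keep
        for p in layer.parameters():
            p.requires_grad = trainable

# Toy "model": 6 linear blocks standing in for transformer layers.
layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(6)])

# Suppose layers 2-3 were identified as the task-relevant (e.g. reasoning)
# layers; only their parameters remain trainable.
freeze_except(layers, keep={2, 3})

trainable = sum(p.numel() for p in layers.parameters() if p.requires_grad)
total = sum(p.numel() for p in layers.parameters())
print(trainable, total)  # only 2 of 6 layers contribute trainable parameters
```

An optimizer built afterwards would then be given only the parameters with `requires_grad=True`, so gradient computation and updates skip the frozen layers entirely.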
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 10759