Modular Fine-Tuning of Clustering: Directional Updating of Weight Parameters for PLMs

ICLR 2026 Conference Submission 6811 Authors

16 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: PLM, Modular Fine-Tuning, NAS
Abstract: With the widespread adoption of pre-trained language models (PLMs) and the pre-training and fine-tuning paradigm, studies have shown that increasing model scale often improves performance but also substantially raises training and storage costs. Mainstream approaches such as LoRA and knowledge distillation reduce computational overhead by decreasing the number of tunable parameters while preserving model performance as much as possible. To achieve a better balance between performance and efficiency, and inspired by neural architecture search (NAS), this paper proposes a modular parameter fine-tuning method, MFTC. Prior work has shown that, during fine-tuning on downstream tasks, the high-magnitude parameters of PLMs tend to lie in a low-dimensional space. Building on this insight, we construct a dynamic modular parameter space and adopt a modular fine-tuning strategy to identify and prioritize the optimization of these critical parameters. Specifically, we introduce a dynamic spectral clustering algorithm that identifies task-relevant subsets of parameters and encapsulates them into functionally independent modules. Neural architecture search is then used to select modules with diverse representational capacities, which are assembled into a high-performance fine-tuned model. Experimental results on several mainstream benchmark datasets demonstrate that the proposed modular fine-tuning approach significantly reduces energy consumption while improving the fine-tuning performance of PLMs on downstream tasks.
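The abstract gives no implementation details, so the following is only a minimal, hypothetical sketch of the clustering-into-modules idea it describes: rows of a toy weight matrix are grouped by spectral clustering on their magnitude features, and the highest-magnitude groups are marked as the "modules" to fine-tune. The variable names, the magnitude-based scoring proxy, and the use of scikit-learn's SpectralClustering are all assumptions for illustration, not the authors' MFTC algorithm or NAS-based module selection.

```python
# Hypothetical sketch (not MFTC): cluster rows of a toy weight matrix into
# "modules" via spectral clustering, then keep only the highest-magnitude
# modules trainable.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 64))          # stand-in for one PLM weight matrix
row_feats = np.abs(W)                   # magnitude features per parameter row

# Spectral clustering with a k-nearest-neighbor affinity on the magnitude features.
n_modules = 8
clust = SpectralClustering(
    n_clusters=n_modules,
    affinity="nearest_neighbors",
    n_neighbors=10,
    random_state=0,
)
labels = clust.fit_predict(row_feats)   # module id for each row of W

# Score each module by mean parameter magnitude and keep the top half trainable,
# a crude proxy for "prioritize high-magnitude parameters".
scores = np.array([np.abs(W[labels == m]).mean() for m in range(n_modules)])
selected = np.argsort(scores)[-n_modules // 2:]

trainable_mask = np.isin(labels, selected)   # True for rows we would fine-tune
print(f"trainable rows: {trainable_mask.sum()} / {W.shape[0]}")
```

In the paper's pipeline, the module-selection step is described as a neural architecture search over modules with diverse representational capacities; the greedy magnitude-based selection above is only a stand-in for that search.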
Primary Area: foundation or frontier models, including LLMs
Submission Number: 6811