Chip-Tuning: Build Efficient Classification Models with Probing

ACL ARR 2025 February Submission851 Authors

11 Feb 2025 (modified: 09 May 2025) · CC BY 4.0
Abstract: Rapid performance gains in large language models (LLMs) have been accompanied by growth in model size, leading to increasing costs for model training and inference. Previous research has found that certain layers in LLMs are redundant, and that removing them causes only marginal loss in model performance. In this paper, we propose chip-tuning, a simple and effective framework for building efficient classification models with probing. Chip-tuning attaches tiny probing classifiers, named chips, to different layers of an LLM and trains the chips while keeping the backbone model's parameters frozen. After a chip is selected for classification, all layers after the attached layer can be removed with marginal performance loss. Experimental results on various LLMs and datasets demonstrate that chip-tuning significantly outperforms previous state-of-the-art baselines in both accuracy and pruning ratio, achieving a pruning ratio of up to 50\%. We also find that chip-tuning can be applied to multimodal models and combined with model finetuning, demonstrating its excellent compatibility.
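The core idea in the abstract — tiny classifiers probing intermediate layers of a frozen backbone, so that later layers can be dropped — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `TinyBackbone`, `Chip`, and all sizes are hypothetical stand-ins for a real LLM and its hidden states.

```python
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Stand-in for a frozen LLM: a small stack of transformer layers."""
    def __init__(self, d_model=32, n_layers=6):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
             for _ in range(n_layers)]
        )

    def forward(self, x, up_to_layer=None):
        # Run only the first `up_to_layer` layers; everything after the
        # chip's attachment point is simply never computed (the pruning).
        for layer in self.layers[:up_to_layer]:
            x = layer(x)
        return x

class Chip(nn.Module):
    """A tiny probing classifier attached to one backbone layer."""
    def __init__(self, d_model=32, n_classes=2):
        super().__init__()
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, hidden):
        # Mean-pool the hidden states over the sequence, then classify.
        return self.head(hidden.mean(dim=1))

backbone = TinyBackbone()
for p in backbone.parameters():
    p.requires_grad_(False)  # backbone stays frozen; only the chip trains

chip = Chip()  # attached after layer 3 of 6, i.e. ~50% of layers pruned
opt = torch.optim.Adam(chip.parameters(), lr=1e-3)

x = torch.randn(8, 10, 32)        # dummy batch: (batch, seq_len, d_model)
y = torch.randint(0, 2, (8,))     # dummy classification labels
with torch.no_grad():             # no gradients flow into the backbone
    hidden = backbone(x, up_to_layer=3)
loss = nn.functional.cross_entropy(chip(hidden), y)
loss.backward()
opt.step()
```

In this sketch, training touches only the chip's linear head; at inference, layers 4–6 of the backbone are never executed, which is where the reported pruning ratio comes from.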
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: pruning, parameter-efficient-training
Contribution Types: NLP engineering experiment, Approaches for low compute settings-efficiency
Languages Studied: English, Chinese
Submission Number: 851