Abstract: Over recent years, state-of-the-art AI models have grown to a point where their use incurs significant economic and environmental costs.
At the same time, analyses of NLP models have shown that they are often overparameterized, giving rise to research on compression approaches.
Such approaches often suffer from a trade-off between hardware requirements and classification performance.
In this work, we propose the hardware-independent compression strategy Adaptive Parameter Compression (APC).
We extend the Weight Squeezing approach by introducing compression weights and biases, and we investigate multiple initialization strategies for these weights as well as the application of APC to individual transformer model components.
Experiments with BERT$_\text{base}$ demonstrate the effectiveness of the compression: the compressed model slightly outperforms DistilBERT while being significantly more efficient.
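The abstract does not spell out the exact form of the compression mapping. As a rough, non-authoritative illustration of what learned compression weights and a compression bias could look like when applied to a single teacher weight matrix, consider the following PyTorch sketch; all class and parameter names, the specific formulation W_student = C · W_teacher + B, and the Xavier initialization are assumptions for illustration only, not the paper's definition of APC.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CompressedLinear(nn.Module):
    """Sketch: derive a smaller student weight matrix from a frozen teacher
    weight matrix via learned compression weights and a compression bias."""

    def __init__(self, teacher_weight: torch.Tensor, student_out_dim: int):
        super().__init__()
        teacher_out_dim, in_dim = teacher_weight.shape
        # The teacher weights are kept frozen; only the compression
        # parameters are trained.
        self.register_buffer("teacher_weight", teacher_weight)
        # Learned compression weights map the teacher's output dimension
        # down to the student's output dimension.
        self.compress_weight = nn.Parameter(
            torch.empty(student_out_dim, teacher_out_dim)
        )
        # Learned compression bias added to the resulting student weights.
        self.compress_bias = nn.Parameter(torch.zeros(student_out_dim, in_dim))
        # One possible initialization strategy (illustrative choice).
        nn.init.xavier_uniform_(self.compress_weight)

    def student_weight(self) -> torch.Tensor:
        # W_student = C @ W_teacher + B
        return self.compress_weight @ self.teacher_weight + self.compress_bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.student_weight())
```

Under these assumptions, the number of trainable parameters is governed by the compression matrix and bias rather than the full teacher layer, which is what makes such a scheme hardware-independent: the compressed student can be materialized once after training and used without any special inference support.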
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: pruning, distillation
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency, Theory
Languages Studied: English
Submission Number: 458