LangCompress: Language-Aware Compression of Large Language Models

ACL ARR 2025 July Submission 1335 Authors

29 Jul 2025 (modified: 31 Aug 2025), ACL ARR 2025 July Submission, CC BY 4.0

Abstract: Large Language Models (LLMs) demonstrate strong multilingual capabilities but are costly to deploy due to their size and computational demands. To mitigate this, compression techniques such as pruning and quantization are widely used. However, these methods face two key limitations: (1) they assume access to high-quality instruction or calibration data, which is often unavailable for low-resource languages; and (2) they aim to preserve multilingual generality, making them inefficient for language-specific applications. We introduce LangCompress, a language-aware compression framework that enhances existing compression methods for targeted deployment. LangCompress is method-agnostic and improves state-of-the-art pruning and quantization approaches. It features two core components: an iterative self-supervised pipeline for generating instruction data in the target language, and a vocabulary simplification strategy that reduces the LM head to focus on key tokens. Experiments on perplexity, translation, and summarization tasks show that LangCompress improves performance in the target language. The code and data are publicly available.
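The vocabulary simplification idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how "key tokens" are chosen, so this example assumes a simple frequency ranking over a tokenized target-language corpus; the function names (`select_key_tokens`, `simplify_lm_head`, `greedy_next_token`) and the toy matrix are hypothetical and are not taken from the LangCompress code.

```python
from collections import Counter

def select_key_tokens(token_id_corpus, keep):
    """Return the `keep` most frequent token ids, sorted ascending.

    Assumption: frequency in target-language text is the selection
    criterion; the paper may use a different one.
    """
    counts = Counter(tid for seq in token_id_corpus for tid in seq)
    # Tie-break on the id itself so the selection is deterministic.
    ranked = sorted(counts, key=lambda tid: (-counts[tid], tid))
    return sorted(ranked[:keep])

def simplify_lm_head(lm_head, kept_ids):
    """Keep only the LM-head rows (one row per vocabulary entry) in kept_ids.

    Returns the reduced matrix and a list mapping new row index -> old id.
    """
    reduced = [lm_head[tid] for tid in kept_ids]
    return reduced, list(kept_ids)

def greedy_next_token(reduced_head, new_to_old, hidden):
    """Score the hidden state against the reduced head and return the
    winning token expressed in the ORIGINAL vocabulary's ids."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in reduced_head]
    best = max(range(len(logits)), key=logits.__getitem__)
    return new_to_old[best]

# Toy setup: a 6-token vocabulary with 2-dimensional hidden states.
toy_lm_head = [
    [0.1, 0.0], [0.5, 0.2], [0.9, 0.9],
    [0.3, 0.8], [0.0, 0.1], [0.7, 0.1],
]
toy_corpus = [[1, 3, 3, 5], [3, 5, 1, 3]]  # token ids of target-language text

kept = select_key_tokens(toy_corpus, keep=2)
reduced_head, new_to_old = simplify_lm_head(toy_lm_head, kept)
```

In this toy setup the output projection shrinks from 6 rows to 2, so the final matrix multiply and softmax run over fewer entries. The cost is that tokens outside the kept set can no longer be generated, which is why the choice of key tokens matters for the target language.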

Paper Type: Long

Research Area: Efficient/Low-Resource Methods for NLP

Research Area Keywords: quantization, pruning

Contribution Types: Approaches to low-resource settings, Approaches to low-compute settings (efficiency)

Languages Studied: German, French, Spanish, Japanese, Vietnamese, Chinese

Reassignment Request Area Chair: This is not a resubmission

Reassignment Request Reviewers: This is not a resubmission

A1 Limitations Section: This paper has a limitations section.

A2 Potential Risks: N/A

B Use Or Create Scientific Artifacts: Yes

B1 Cite Creators Of Artifacts: Yes

B1 Elaboration: Section 4 Experiments

B2 Discuss The License For Artifacts: N/A

B3 Artifact Use Consistent With Intended Use: N/A

B4 Data Contains Personally Identifying Info Or Offensive Content: N/A

B5 Documentation Of Artifacts: N/A

B6 Statistics For Data: N/A

C Computational Experiments: Yes

C1 Model Size And Budget: Yes

C1 Elaboration: Section 4 Experiments

C2 Experimental Setup And Hyperparameters: Yes

C2 Elaboration: Section 4 Experiments

C3 Descriptive Statistics: N/A

C3 Elaboration: Section 4 Experiments

C4 Parameters For Packages: Yes

C4 Elaboration: Section 4 Experiments

D Human Subjects Including Annotators: No

D1 Instructions Given To Participants: N/A

D2 Recruitment And Payment: N/A

D3 Data Consent: N/A

D4 Ethics Review Board Approval: N/A

D5 Characteristics Of Annotators: N/A

E Ai Assistants In Research Or Writing: No

E1 Information About Use Of Ai Assistants: N/A

Author Submission Checklist: yes

Submission Number: 1335
