Abstract: Large Language Models (LLMs) demonstrate strong multilingual capabilities but are costly to deploy due to their size and computational demands. To mitigate this, compression techniques such as pruning and quantization are widely used. However, these methods face two key limitations: (1) they assume access to high-quality instruction or calibration data, which is often unavailable for low-resource languages; and (2) they aim to preserve multilingual generality, making them inefficient for language-specific applications. We introduce LangCompress, a language-aware compression framework that enhances existing compression methods for targeted deployment. LangCompress is method-agnostic and improves state-of-the-art pruning and quantization approaches. It features two core components: an iterative self-supervised pipeline for generating instruction data in the target language, and a vocabulary simplification strategy that reduces the LM head to focus on key tokens. Experiments on perplexity, translation, and summarization tasks show that LangCompress improves performance in the target language. The code and data are publicly available.
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: quantization, pruning
Contribution Types: Approaches to low-resource settings, Approaches low compute settings-efficiency
Languages Studied: German, French, Spanish, Japanese, Vietnamese, Chinese
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 4 Experiemnts
B2 Discuss The License For Artifacts: N/A
B3 Artifact Use Consistent With Intended Use: N/A
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: N/A
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 4 Experiments
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Section 4 Experiments
C3 Descriptive Statistics: N/A
C3 Elaboration: Section 4 Experiments
C4 Parameters For Packages: Yes
C4 Elaboration: Section 4 Experiments
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: yes
Submission Number: 1335
Loading