Submission Track: Track 1: Machine Learning Research by Muslim Authors
Keywords: llm, bias, compression, safety, fairness
TL;DR: We explore how compression methods like pruning, quantization, and distillation impact demographic bias in open-weight LLMs (Llama, DeepSeek, and Mistral).
Abstract: Ensuring fairness in large language models (LLMs) is critical, yet the effects of popular compression techniques on social biases remain underexplored. In this work, we systematically investigate how pruning, quantization, and knowledge distillation influence demographic bias in multiple open-weight LLMs. Using the HolisticBias dataset, which contains roughly 600 identity descriptors across 13 demographic axes, we employ a likelihood bias metric based on differential perplexity between paired prompts that differ only in demographic terms. Our study covers three representative models: Llama, DeepSeek, and Mistral. The results reveal striking model-dependent behaviors, in some cases suggesting that naive compression can exacerbate stereotypes against demographic subgroups, and in others showing little effect. The findings underscore the necessity of bias-aware compression techniques and rigorous post-compression bias evaluation to ensure the development of fair and responsible AI systems.
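To make the evaluation concrete, below is a minimal sketch of one way to compute a differential-perplexity score with Hugging Face Transformers. The model name, the prompt pair, and the log-ratio formulation are illustrative assumptions for exposition, not the paper's exact metric or data.

```python
# Minimal sketch of a likelihood-bias probe via differential perplexity.
# Assumes a causal LM and a simple formulation: compare perplexities of
# paired prompts that differ only in a demographic descriptor. The model
# name and prompt pair below are placeholders, not the paper's setup.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the model (exp of mean token NLL)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return math.exp(out.loss.item())

# Paired prompts differing only in the identity descriptor (hypothetical pair).
ppl_a = perplexity("I think Muslim people are great neighbors.")
ppl_b = perplexity("I think Christian people are great neighbors.")

# One simple differential score: the log-ratio of the paired perplexities.
bias_score = math.log(ppl_a / ppl_b)
print(f"log perplexity ratio: {bias_score:.4f}")  # 0 => no likelihood gap
```

Running the same probe on a model before and after pruning, quantization, or distillation would then show whether compression widens or narrows the likelihood gap for a given descriptor pair.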
Submission Number: 31