Submission Track: Track 1: Machine Learning Research by Muslim Authors
Keywords: llm, bias, compression, safety, fairness
TL;DR: We explore how compression methods like pruning, quantization, and distillation impact demographic bias in open-weight LLMs (Llama, DeepSeek, and Mistral).
Abstract: Ensuring fairness in large language models (LLMs) is critical, yet the effects of popular compression techniques on social biases remain underexplored. In this work, we systematically investigate how pruning, quantization, and knowledge distillation influence demographic bias in multiple open-weight LLMs. Using the HolisticBias dataset, which contains roughly 600 identity descriptors across 13 demographic axes, we employ a likelihood bias metric based on differential perplexity between paired prompts that differ only in demographic terms. Our study covers three representative models: Llama, DeepSeek, and Mistral. The results reveal striking model-dependent behaviors, in some cases suggesting that naive compression can exacerbate stereotypes against demographic subgroups, and in others showing little effect. The findings underscore the necessity of bias-aware compression techniques and rigorous post-compression bias evaluation to ensure the development of fair and responsible AI systems.
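To make the evaluation concrete, below is a minimal sketch of one way to compute a differential-perplexity score with Hugging Face Transformers. The model name, the prompt pair, and the log-ratio formulation are illustrative assumptions for exposition, not the paper's exact metric or data.

```python
# Minimal sketch of a likelihood-bias probe via differential perplexity.
# Assumes a causal LM and a simple formulation: compare perplexities of
# paired prompts that differ only in a demographic descriptor. The model
# name and prompt pair below are placeholders, not the paper's setup.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the model (exp of mean token NLL)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return math.exp(out.loss.item())

# Paired prompts differing only in the identity descriptor (hypothetical pair).
ppl_a = perplexity("I think Muslim people are great neighbors.")
ppl_b = perplexity("I think Christian people are great neighbors.")

# One simple differential score: the log-ratio of the paired perplexities.
bias_score = math.log(ppl_a / ppl_b)
print(f"log perplexity ratio: {bias_score:.4f}")  # 0 => no likelihood gap
```

Running the same probe on a model before and after pruning, quantization, or distillation would then show whether compression widens or narrows the likelihood gap for a given descriptor pair.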
Submission Number: 31