LLM Size Reduction and Carbon Footprint

Published: 31 Oct 2025, Last Modified: 31 Oct 2025 · BNAIC/BeNeLearn 2025 Poster · CC BY 4.0
Track: Type E (Late-Breaking Abstracts)
Keywords: LLM compression, quantization, energy efficiency, carbon footprint
Abstract: Compression techniques like quantization reduce memory and speed up inference for LLMs, but their environmental impact during inference is underexplored. This study quantifies how 4-bit quantization affects performance and CO$_2$-equivalent emissions across hardware and electricity mixes using LLaMA-7B/30B and Mistral-7B-v0.3/Small 3. Results show negligible accuracy loss but hardware-dependent energy effects (-39% to +26%) and strong geographic dependence: compressed models in carbon-intensive grids can emit up to 6$\times$ more CO$_2$eq than uncompressed models in low-carbon grids. These findings link compression, hardware efficiency, and grid context, advocating for carbon-aware LLM deployment.
Submission Number: 85
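The geographic-dependence claim in the abstract reduces to a simple product: CO$_2$eq emissions = inference energy consumed × grid carbon intensity. A minimal sketch of that arithmetic, using made-up illustrative numbers (none of the energy or intensity values below are taken from the paper):

```python
def co2eq_grams(energy_kwh: float, grid_intensity_g_per_kwh: float) -> float:
    """CO2-equivalent emissions (grams) = energy used (kWh) x grid intensity (gCO2eq/kWh)."""
    return energy_kwh * grid_intensity_g_per_kwh

# Hypothetical per-query inference energy (kWh); assumed values.
uncompressed_kwh = 0.010
compressed_kwh = 0.008   # compression may cut energy, or raise it on some hardware

# Hypothetical grid carbon intensities (gCO2eq/kWh); assumed values.
low_carbon_grid = 50     # e.g. a hydro/nuclear-heavy mix
high_carbon_grid = 600   # e.g. a coal-heavy mix

compressed_dirty = co2eq_grams(compressed_kwh, high_carbon_grid)
uncompressed_clean = co2eq_grams(uncompressed_kwh, low_carbon_grid)

# A compressed model on a carbon-intensive grid can still emit several times
# more CO2eq than an uncompressed model on a low-carbon grid.
print(compressed_dirty / uncompressed_clean)  # ratio well above 1
```

This is why the abstract argues that compression alone does not guarantee lower emissions: where the model runs can dominate how it is compressed.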