Unified Scaling Laws for Compressed Representations

Published: 11 Jun 2025 · Last Modified: 10 Jul 2025 · ES-FoMo III · CC BY 4.0
Keywords: scaling laws, large language models, model compression, quantization, sparsity
TL;DR: We propose a unified scaling law that accurately predicts performance across sparse and quantized models using a single capacity metric, which can be derived from the representation alone.
Abstract: Scaling laws have shaped recent advances in machine learning by predicting model performance as a function of model size, computation, and data. Concurrently, the rising computational cost of AI has made model compression techniques, notably quantization and sparsification, essential for large-scale training and inference. This paper investigates the interplay between scaling laws and compression formats, exploring whether a unified scaling framework can accurately predict model performance when training occurs over various compressed representations, such as sparse, scalar-quantized, or sparse-quantized ones. We validate a general scaling law formulation and show that it applies both to individual compression types and composably across them. Our main result demonstrates that there exists a simple "capacity" metric, based on fitting random Gaussian data, which robustly predicts parameter efficiency across multiple representations.
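To make the "capacity" idea concrete, below is a minimal, hypothetical sketch of how one might score a compressed representation by how well it fits random Gaussian data relative to a dense, full-precision baseline. The `compress` and `capacity_score` functions, and the choice of magnitude pruning plus uniform scalar quantization, are illustrative assumptions and not necessarily the paper's actual procedure.

```python
# Hypothetical sketch: estimate a "capacity" score for a compressed weight
# representation by how much of the dense fit quality it retains on random
# Gaussian data. Not the paper's method; an assumed toy illustration.
import numpy as np


def fit_error(weights: np.ndarray, X: np.ndarray, y: np.ndarray) -> float:
    """Mean-squared error of a fixed linear map on the given data."""
    return float(np.mean((X @ weights - y) ** 2))


def compress(weights: np.ndarray, bits: int | None = 4, sparsity: float = 0.0) -> np.ndarray:
    """Toy compression: magnitude pruning followed by uniform scalar quantization."""
    w = weights.copy()
    if sparsity > 0.0:
        k = int(sparsity * w.size)
        threshold = np.sort(np.abs(w))[k]      # k-th smallest magnitude
        w[np.abs(w) < threshold] = 0.0         # prune the smallest entries
    if bits is not None:
        scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
        w = np.round(w / scale) * scale        # symmetric uniform quantization
    return w


def capacity_score(d: int = 512, n: int = 4096, bits: int | None = 4,
                   sparsity: float = 0.5, seed: int = 0) -> float:
    """Fraction in [0, 1] of the dense model's fit quality retained after compression."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    y = rng.standard_normal(n)                 # random Gaussian targets
    w_dense, *_ = np.linalg.lstsq(X, y, rcond=None)
    w_comp = compress(w_dense, bits=bits, sparsity=sparsity)
    base = float(np.mean(y ** 2))              # error of the trivial zero predictor
    dense_gain = base - fit_error(w_dense, X, y)
    comp_gain = base - fit_error(w_comp, X, y)
    return max(comp_gain, 0.0) / dense_gain


if __name__ == "__main__":
    print(f"4-bit, 50% sparse capacity ≈ {capacity_score():.3f}")
```

Under this reading, a representation's capacity acts as a multiplier on effective parameter count, which is how a single scaling-law form could cover sparse, quantized, and sparse-quantized models at once.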
Submission Number: 109