Keywords: Compression, Singular Learning Theory, Minimum Description Length, Quantization
TL;DR: We extend the minimum description length principle to neural networks and use it to study the compressibility of Pythia models
Abstract: We study neural network compressibility by using singular learning theory to extend the minimum description length (MDL) principle to singular models like neural networks. Through extensive experiments on the Pythia suite with quantization, factorization, and other compression techniques, we find that complexity estimates based on the local learning coefficient (LLC) are closely and, in some cases, linearly correlated with compressibility. Our results provide a path toward rigorously evaluating the limits of model compression.
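To make the compression setting concrete, here is a minimal, hypothetical sketch, not the paper's method, of uniform post-training weight quantization and the crude description-length saving it yields; the function name `quantize_uniform`, the 4-bit width, and the toy weight matrix are all assumptions for illustration.

```python
# Illustrative only: uniform quantization of a weight matrix and a crude
# description-length comparison in bits. Not the paper's estimator.
import numpy as np

def quantize_uniform(w: np.ndarray, n_bits: int = 4):
    """Uniformly quantize weights to 2**n_bits levels; return dequantized weights and codes."""
    levels = 2 ** n_bits - 1
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = np.round((w - w_min) / scale).astype(np.int32)  # integer codes
    return codes * scale + w_min, codes                      # dequantized, codes

rng = np.random.default_rng(0)
w = rng.normal(size=(512, 512)).astype(np.float32)           # toy "layer"

w_hat, codes = quantize_uniform(w, n_bits=4)
orig_bits = w.size * 32            # float32 storage
quant_bits = codes.size * 4        # 4-bit codes (ignoring scale/offset overhead)
print(f"reconstruction RMSE: {np.sqrt(np.mean((w - w_hat) ** 2)):.4f}")
print(f"compression ratio:   {orig_bits / quant_bits:.1f}x")
```

The sketch trades reconstruction error for a shorter code length; the paper's question is how such achievable trade-offs relate to LLC-based complexity estimates.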
Primary Area: learning theory
Submission Number: 19803