Exploring the Trade-Off between Model Complexity and Numerical Precision for Efficient Edge AI Inference
Keywords: Neural networks, Edge AI, Artificial Intelligence, Model compression
TL;DR: We study whether it is preferable to lower the number of parameters or their numerical precision, when seeking to compress a neural network.
Abstract: When compressing neural networks, adopting low-bit representations for both parameters and activations has proven highly effective. Learning quantized weights through Quantization Aware Training (QAT) stands out as a powerful means to substantially reduce the memory a given model requires to perform inference efficiently. However, despite the numerous works reporting gains achieved with QAT, a comparison with a notably simpler technique - reducing the model's complexity by using fewer parameters - is often absent.
In this paper, we attempt to answer a seemingly simple question: to reduce a given model's storage requirements, is it better to reduce the number of parameters or to reduce the numerical precision at which they are stored? We explore the trade-off between the dimensionality of the parameters and activations one can afford to keep in memory, and the numerical precision used to represent them. Through our experiments in image classification, keyword spotting and language modelling, our results suggest that quantizing weights to $2$ bits while keeping a larger number of parameters is optimal, regardless of the task and model architecture considered.
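To make the QAT setting in the abstract concrete, below is a minimal sketch (not the paper's code) of 2-bit weight fake quantization with a straight-through estimator in PyTorch; the `FakeQuantize`/`QuantLinear` names, the per-tensor symmetric scaling, and the bit-width handling are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of weight-only Quantization Aware Training (QAT):
# weights are rounded to a low-bit uniform grid on the forward pass,
# while gradients flow through unchanged (straight-through estimator).
import torch
import torch.nn as nn

class FakeQuantize(torch.autograd.Function):
    """Round weights to a b-bit symmetric uniform grid; pass gradients straight through."""
    @staticmethod
    def forward(ctx, w, bits=2):
        qmax = 2 ** (bits - 1) - 1                    # e.g. 1 for 2 bits -> levels {-2, -1, 0, 1}
        scale = w.abs().max().clamp(min=1e-8) / qmax  # per-tensor symmetric scale (assumption)
        return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None                      # straight-through estimator for w, no grad for bits

class QuantLinear(nn.Linear):
    """Linear layer whose weights are fake-quantized on every forward pass."""
    def __init__(self, in_features, out_features, bits=2):
        super().__init__(in_features, out_features)
        self.bits = bits

    def forward(self, x):
        w_q = FakeQuantize.apply(self.weight, self.bits)
        return nn.functional.linear(x, w_q, self.bias)

# Usage: train as usual; full-precision weights are updated by the optimizer,
# while the quantized copies are what the network actually uses at inference.
layer = QuantLinear(128, 64, bits=2)
out = layer(torch.randn(8, 128))
```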
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6251