Sparse matrix products for neural network compression

Luc Giffon; Hachem Kadri; Stephane Ayache; Ronan Sicre; Thierry Artieres

28 Sept 2020 (modified: 05 May 2023) | ICLR 2021 Conference Blind Submission | Readers: Everyone

Keywords: Compression, sparsity

Abstract: Over-parameterization of neural networks is a well-known issue that comes along with their great performance. Among the many approaches proposed to tackle this problem, low-rank tensor decompositions have been widely investigated for compressing deep neural networks. Such techniques rely on a low-rank assumption on the layer weight tensors that does not always hold in practice. Following this observation, this paper studies sparsity-inducing techniques to build a new sparse matrix product layer for high-rate neural network compression. Specifically, we explore recent advances in sparse optimization to replace each layer's weight matrix, whether convolutional or fully connected, with a product of sparse matrices. Our experiments validate that our approach provides a better compression-accuracy trade-off than most popular low-rank-based compression techniques.

One-sentence Summary: The paper explores high-rate neural network compression by factorising weight matrices as products of sparse matrices.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Reviewed Version (pdf): https://openreview.net/references/pdf?id=50aEzlqHZzd
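
The idea stated in the abstract, replacing a dense weight matrix with a product of sparse matrices, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names (`random_sparse_factor`, `apply_sparse_product`), the random sparsity pattern, and the sizes (`dim`, `num_factors`, `nnz_per_row`) are all assumptions chosen for illustration, and the paper learns its factors through sparse optimization rather than sampling them. The sketch only shows why such a layer can store fewer parameters than a dense one.

```python
import numpy as np


def random_sparse_factor(rows, cols, nnz_per_row, rng):
    """Hypothetical helper: a matrix with at most nnz_per_row nonzeros per row."""
    factor = np.zeros((rows, cols))
    for i in range(rows):
        # Pick distinct column positions for this row's nonzero entries.
        support = rng.choice(cols, size=nnz_per_row, replace=False)
        factor[i, support] = rng.standard_normal(nnz_per_row)
    return factor


def apply_sparse_product(factors, x):
    """Compute (S_1 @ S_2 @ ... @ S_Q) @ x without forming the dense product."""
    out = x
    # The rightmost factor is applied to x first.
    for factor in reversed(factors):
        out = factor @ out
    return out


rng = np.random.default_rng(0)
dim, num_factors, nnz_per_row = 64, 3, 4  # illustrative values only
factors = [
    random_sparse_factor(dim, dim, nnz_per_row, rng) for _ in range(num_factors)
]

# Parameters stored by a dense layer versus by the sparse factors.
dense_params = dim * dim
sparse_params = sum(int(np.count_nonzero(f)) for f in factors)

x = rng.standard_normal(dim)
y = apply_sparse_product(factors, x)

print(f"dense parameters:  {dense_params}")
print(f"sparse parameters: {sparse_params}")
```

For simplicity the factors are held in dense arrays here. An actual memory saving would require a sparse storage format, and the accuracy of the compressed layer depends on how the factors are fitted, which this sketch does not address.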