ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks

Shuxuan Guo; Jose M. Alvarez; Mathieu Salzmann

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone

TL;DR: This paper proposes linear expansion strategies building upon over-parameterization to facilitate practical compact network training.

Abstract: In this paper, we introduce a novel approach to training a given compact network. To this end, we build upon over-parameterization, which typically improves both optimization and generalization in neural network training, while being unnecessary at inference time. We propose to expand each linear layer of the compact network into multiple linear layers, without adding any nonlinearity. As such, the resulting expanded network can benefit from over-parameterization during training but can be compressed back to the compact one algebraically at inference. As evidenced by our experiments, this approach consistently outperforms both training the compact network from scratch and knowledge distillation from a teacher. In this context, we introduce several expansion strategies, together with an initialization scheme, and demonstrate the benefits of our ExpandNets on several tasks, including image classification, object detection, and semantic segmentation.

Keywords: Compact Network Training, Linear Expansion, Over-parameterization, Knowledge Transfer
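
The algebraic compression mentioned in the abstract can be illustrated for the simplest case, a single fully connected layer. The sketch below is not the authors' implementation: the layer sizes, variable names, and helper functions are hypothetical, and it does not cover the convolutional expansion strategies. It only shows that two stacked linear layers with no nonlinearity between them compute the same function as one layer with W = W2 W1 and b = W2 b1 + b2.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def add(u, v):
    """Element-wise sum of two vectors."""
    return [p + q for p, q in zip(u, v)]


rng = random.Random(0)
# Hypothetical sizes: the hidden width is larger than input and output,
# so the expanded form has more parameters than the compact one.
d_in, d_hidden, d_out = 4, 16, 3

# Expanded form: two linear layers, no nonlinearity in between.
w1 = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(d_hidden)]
w2 = [[rng.uniform(-1, 1) for _ in range(d_hidden)] for _ in range(d_out)]
b2 = [rng.uniform(-1, 1) for _ in range(d_out)]


def expanded_forward(x):
    return add(matvec(w2, add(matvec(w1, x), b1)), b2)


# Compact form: collapse the two layers into one.
w = matmul(w2, w1)           # shape d_out x d_in
b = add(matvec(w2, b1), b2)  # length d_out


def compact_forward(x):
    return add(matvec(w, x), b)


x = [rng.uniform(-1, 1) for _ in range(d_in)]
max_diff = max(abs(p - q) for p, q in zip(expanded_forward(x), compact_forward(x)))
print(max_diff)  # differs from zero only by floating-point rounding
```

In this toy setting the expanded form holds d_hidden * (d_in + d_out) weights while the collapsed layer holds d_in * d_out, which is the sense in which the extra parameters exist only during training.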
