Recurrent Convolutions: A Model Compression Point of View

Zhendong Zhang; Cheolkon Jung

Published: 07 Nov 2018, Last Modified: 05 May 2023 | NIPS 2018 Workshop CDNNRIA Blind Submission | Readers: Everyone

Abstract: Recurrent convolution (RC) shares the same convolutional kernels and unrolls them multiple times; it was originally proposed to model time-space signals. We suggest that RC can be viewed as a model compression strategy for deep convolutional neural networks. RC reduces the redundancy across layers and is complementary to most existing model compression approaches. However, the performance of an RC network cannot match that of its corresponding standard network, i.e., one with the same depth but independent convolutional kernels. This reduces the value of RC for model compression. In this paper, we propose a simple variant that improves RC networks: the batch normalization layers of an RC module are learned independently (not shared) for different unrolling steps. We provide insights on why this works. Experiments on CIFAR show that unrolling a convolutional layer for several steps can improve performance and thus indirectly plays a role in model compression.

TL;DR: Recurrent convolution for model compression, and a trick for training it: learning independent BN layers for each unrolling step.

Keywords: recurrent convolution, model compression, batch normalization
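
The idea in the abstract, one shared convolutional kernel unrolled for several steps with a separate batch normalization layer per step, can be illustrated with a small toy. The sketch below is not the authors' implementation: it assumes a single-channel 1-D signal, an odd-length kernel, and training-mode batch statistics, and every name in it (`RecurrentConv1d`, `conv1d_same`, `batch_norm`, `standard_num_params`) is hypothetical. The paper's experiments use 2-D convolutions on CIFAR.

```python
import math
import random


def conv1d_same(x, kernel):
    """1-D cross-correlation with zero padding; assumes an odd kernel length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize with statistics of the whole batch (one channel), then scale and shift."""
    values = [v for x in batch for v in x]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = gamma / math.sqrt(var + eps)
    return [[scale * (v - mean) + beta for v in x] for x in batch]


class RecurrentConv1d:
    """Toy RC module: one shared kernel, one (gamma, beta) pair per unrolling step."""

    def __init__(self, kernel, steps):
        self.kernel = list(kernel)  # shared across all unrolling steps
        # Independent BN parameters for each step (not shared).
        self.bn = [{"gamma": 1.0, "beta": 0.0} for _ in range(steps)]

    def forward(self, batch):
        h = batch
        for params in self.bn:  # one iteration per unrolling step
            h = [conv1d_same(x, self.kernel) for x in h]
            h = batch_norm(h, params["gamma"], params["beta"])
            h = [[max(0.0, v) for v in x] for x in h]  # ReLU
        return h

    def num_params(self):
        return len(self.kernel) + 2 * len(self.bn)


def standard_num_params(kernel_size, depth):
    """Parameter count of the same-depth stack with independent kernels."""
    return depth * (kernel_size + 2)


random.seed(0)
batch = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
rc = RecurrentConv1d([0.25, 0.5, 0.25], steps=4)
out = rc.forward(batch)
print(rc.num_params(), standard_num_params(3, 4))  # 11 20
```

In this toy, the per-step BN parameters add only two scalars per unrolling step, while the kernel weights are stored once, which is where the compression relative to the same-depth standard stack comes from.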
