- Abstract: In this paper we study the problem of learning the weights of a deep convolutional neural network. We consider a network where convolutions are carried out over non-overlapping patches with a single kernel in each layer. We develop an algorithm for simultaneously learning all the kernels from the training data. Our approach dubbed Deep Tensor Decomposition (DeepTD) is based on a rank-1 tensor decomposition. We theoretically investigate DeepTD under a realizable model for the training data where the inputs are chosen i.i.d. from a Gaussian distribution and the labels are generated according to planted convolutional kernels. We show that DeepTD is data-efficient and provably works as soon as the sample size exceeds the total number of convolutional weights in the network. Our numerical experiments demonstrate the effectiveness of DeepTD and verify our theoretical findings.
- Keywords: convolutional neural network, tensor decomposition, sample complexity, approximation
- TL;DR: We consider a simplified deep convolutional neural network model. We show that all layers of this network can be approximately learned with a proper application of tensor decomposition.