Neural Networks with Block Diagonal Inner Product Layers
Nov 03, 2017 (modified: Nov 03, 2017) · ICLR 2018 Conference Blind Submission
Abstract: Two difficulties continue to burden deep learning researchers and users: (1) neural networks are cumbersome tools that grow with the complexity of the learning problem, and (2) the activity of fully connected, or inner product, layers remains mysterious. We address both issues by considering a modified version of the fully connected layer that we call a block diagonal inner product layer. These modified layers have block diagonal weight matrices, turning a single fully connected layer into a set of densely connected neuron groups. This method condenses network storage and speeds up run time without significant adverse effect on test accuracy, offering a new approach to the first problem. Comparing how the variance and singular values of a layer's weights change through training as the number of blocks varies gives insight into the second problem: the ratio of the variances of the weights remains constant throughout training, so the structural relationship is preserved in the final parameter distribution. We observe that trained inner product layers have structure similar to that of truly random matrices with i.i.d. entries, and that each block in a block diagonal inner product layer behaves like a smaller copy of the full layer, giving a better understanding of the nature of inner product layers.
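The core idea above, replacing a dense weight matrix with a block diagonal one so each neuron group is computed independently, can be illustrated with a minimal NumPy sketch. The function name, block sizes, and dimensions below are illustrative assumptions, not taken from the paper; the sanity check confirms that applying the blocks separately matches multiplying by the full block diagonal matrix.

```python
import numpy as np

def block_diagonal_linear(x, blocks):
    """Apply a block diagonal inner product (fully connected) layer.

    x      : input vector, split across the blocks' input dimensions
    blocks : list of (d_out_i, d_in_i) weight matrices; the full weight
             matrix is their block diagonal, but we never materialize it,
             which is where the storage and run-time savings come from.
    """
    splits = np.cumsum([W.shape[1] for W in blocks])[:-1]
    pieces = np.split(x, splits)
    return np.concatenate([W @ p for W, p in zip(blocks, pieces)])

rng = np.random.default_rng(0)
# Hypothetical sizes: 3 blocks of shape (4, 4) replacing a 12x12 dense layer.
blocks = [rng.standard_normal((4, 4)) for _ in range(3)]
x = rng.standard_normal(12)

y = block_diagonal_linear(x, blocks)

# Sanity check: build the explicit 12x12 block diagonal matrix and compare.
W_full = np.zeros((12, 12))
for i, W in enumerate(blocks):
    W_full[4 * i:4 * (i + 1), 4 * i:4 * (i + 1)] = W
assert np.allclose(y, W_full @ x)
```

The block-wise form stores 3 × 16 = 48 weights instead of 144 for the dense layer, and each block's matrix-vector product can run independently, which is the efficiency gain the abstract describes.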
TL;DR: We look at neural networks with block diagonal inner product layers for efficiency and offer some analysis.
Keywords: Deep Learning, Neural Networks, Random Matrix Theory