Understanding Deep Learning Generalization by Maximum Entropy
Guanhua Zheng, Jitao Sang, Changsheng Xu
Feb 15, 2018 (modified: Feb 15, 2018) · ICLR 2018 Conference Blind Submission
Abstract: Deep learning achieves remarkable generalization capability with an overwhelming number of model parameters. Theoretical understanding of deep learning generalization has received recent attention yet remains not fully explored. This paper attempts to provide an alternative understanding from the perspective of maximum entropy. We first derive two feature conditions under which softmax regression strictly applies the maximum entropy principle. A DNN is then regarded as approximating these feature conditions with multilayer feature learning, and is proved to be a recursive solution to the maximum entropy principle. The connection between DNNs and maximum entropy explains why typical designs such as shortcut connections and regularization improve model generalization, and provides guidance for future model development.
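For context, a minimal sketch of the classical conditional maximum entropy derivation underlying the softmax connection; the notation (\tilde{p}, f_i, \lambda_i) and setup are standard textbook conventions, not the paper's specific feature conditions, which are its own contribution and are not reproduced here:

  \max_{p}\; -\sum_{x,y} \tilde{p}(x)\, p(y \mid x) \log p(y \mid x)
  \text{s.t.}\quad \sum_{x,y} \tilde{p}(x)\, p(y \mid x)\, f_i(x,y) = \sum_{x,y} \tilde{p}(x,y)\, f_i(x,y), \qquad \sum_{y} p(y \mid x) = 1.

Solving the Lagrangian yields the log-linear (softmax) form

  p(y \mid x) = \frac{\exp\!\big(\sum_i \lambda_i f_i(x,y)\big)}{\sum_{y'} \exp\!\big(\sum_i \lambda_i f_i(x,y')\big)},

which coincides with softmax regression when the features f_i(x,y) pair the coordinates of x with class indicators.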
TL;DR: We prove that a DNN is a recursive, approximate solution to the maximum entropy principle.
Keywords: generalization, maximum entropy, deep learning