Empirical Investigation on Model Capacity and Generalization of Neural Networks for Text
Nov 03, 2017 (modified: Nov 03, 2017) | ICLR 2018 Conference Blind Submission | readers: everyone
Abstract: Recently, deep neural network models have shown promise on many natural language processing (NLP) tasks. In practice, the number of parameters in a deep neural model is often significantly larger than the size of the training set, and the model's generalization behavior cannot be explained by classic generalization theory. In this paper, we empirically investigate the model capacity and generalization of neural models for text through extensive experiments. The experiments show that deep neural models can find patterns better than brute-force memorization. Therefore, a large-capacity model trained with early-stopping stochastic gradient descent (SGD) as an implicit regularizer seems to be the best choice, as it generalizes better and converges faster.
Keywords: Text, Empirical Investigation, Model Capacity, Generalization Ability, Neural Networks, Deep Learning