Keywords: Regularization, Hessian Trace, Stochastic Estimator, Nonlinear Dynamical System, Generalization Error
Abstract: In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of the Hessian. This regularizer is motivated by a recent bound on the generalization error. The Hutchinson method is a classical unbiased estimator for the trace of a matrix, but it is very time-consuming on deep learning models; hence, we propose a dropout scheme to implement the Hutchinson method efficiently. We then discuss a connection to the linear stability of a nonlinear dynamical system. Experiments demonstrate that our method outperforms existing regularizers such as the Jacobian penalty, confidence penalty, and label smoothing. Our regularization method is also orthogonal to data augmentation methods, achieving the best performance when combined with data augmentation.
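The abstract's core tool, Hutchinson's estimator, approximates tr(A) using only matrix-vector products with random Rademacher probe vectors, since E[vᵀAv] = tr(A) when the entries of v are i.i.d. ±1. A minimal sketch (the function name, sample count, and toy matrix below are illustrative choices, not from the paper; in the paper's setting the matrix-vector product would be a Hessian-vector product of the network loss):

```python
import numpy as np

def hutchinson_trace(matvec, dim, num_samples=2000, seed=None):
    """Estimate tr(A) given only a matrix-vector product oracle.

    Hutchinson's estimator: E[v^T A v] = tr(A) when the entries of v
    are i.i.d. Rademacher (+1/-1) random variables, so averaging
    v^T A v over many probes gives an unbiased trace estimate.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        v = rng.integers(0, 2, size=dim) * 2.0 - 1.0  # Rademacher probe
        total += v @ matvec(v)
    return total / num_samples

# Toy example: a small symmetric matrix with true trace 5.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
est = hutchinson_trace(lambda v: A @ v, dim=2, seed=0)
```

Each probe costs one matrix-vector product, which is why the paper's dropout scheme matters: for a deep network, even one exact Hessian-vector product per probe is expensive, and reducing the number or cost of probes is the practical bottleneck.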