Minimisation methods for training feedforward neural networks

P. Patrick van der Smagt

Neural Networks, 1994 (modified: 04 Sept 2020)

Abstract: Minimisation methods for training feedforward networks with back propagation are compared. Feedforward neural network training is a special case of function minimisation in which no explicit model of the data is assumed. For this reason, and because of the high dimensionality of the data, linearising the training problem by means of orthogonal basis functions is not desirable. The focus is therefore on function minimisation on any basis. Quasi-Newton and conjugate gradient methods are reviewed, and the latter are shown to be a special case of error back propagation with a momentum term. Three feedforward learning problems are tested with five methods. It is shown that, owing to its fixed step size, standard error back propagation performs well in avoiding local minima. However, using the second derivative of the error function in addition to the local gradient yields a much shorter training time. Conjugate gradient with Powell restarts proves to be the superior method.
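As an illustration of the last method named in the abstract, here is a minimal sketch of nonlinear conjugate gradient with a Powell-style restart test. It is not the paper's implementation. The Polak-Ribière formula for beta, the backtracking line search, the restart threshold of 0.2, the toy quadratic error function, and every name (`cg_powell`, `dot`, `f`, `grad`, `w_min`) are assumptions chosen for this example. The comment on the direction update notes how it resembles a momentum term, with coefficients recomputed at every step.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def cg_powell(f, grad, w, iters=500, tol=1e-8):
    """Nonlinear conjugate gradient with a Powell-style restart test (sketch)."""
    g = grad(w)
    d = [-gi for gi in g]  # first direction: steepest descent
    for _ in range(iters):
        if math.sqrt(dot(g, g)) < tol:
            break
        slope = dot(g, d)
        if slope >= 0.0:
            # Not a descent direction: fall back to steepest descent.
            d = [-gi for gi in g]
            slope = dot(g, d)
        # Backtracking line search with an Armijo sufficient-decrease test
        # (an assumption; any line search could be used here).
        alpha, f0 = 1.0, f(w)
        while alpha > 1e-12:
            trial = [wi + alpha * di for wi, di in zip(w, d)]
            if f(trial) <= f0 + 1e-4 * alpha * slope:
                break
            alpha *= 0.5
        w = [wi + alpha * di for wi, di in zip(w, d)]
        g_new = grad(w)
        # Powell-style restart: if successive gradients are far from
        # orthogonal, discard the old direction (beta = 0).
        if abs(dot(g_new, g)) >= 0.2 * dot(g_new, g_new):
            beta = 0.0
        else:
            # Polak-Ribiere beta, clipped at zero.
            y = [a - b for a, b in zip(g_new, g)]
            beta = max(0.0, dot(g_new, y) / dot(g, g))
        # New direction = negative gradient + beta * previous direction.
        # The second term acts like a momentum term whose coefficient
        # is recomputed at every step.
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return w


# Toy error function (hypothetical): minimum at w = (1, -2).
def f(w):
    return (w[0] - 1.0) ** 2 + 10.0 * (w[1] + 2.0) ** 2


def grad(w):
    return [2.0 * (w[0] - 1.0), 20.0 * (w[1] + 2.0)]


w_min = cg_powell(f, grad, [0.0, 0.0])
print(w_min)
```

For a network, `f` would be the summed training error as a function of the weight vector and `grad` the gradient obtained by back propagation; the toy quadratic only keeps the sketch self-contained.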
