A Primal-Dual Method for Training Recurrent Neural Networks Constrained by the Echo-State Property

Jianshu Chen, Li Deng

Dec 20, 2013 (modified: Dec 20, 2013) ICLR 2014 conference submission readers: everyone
  • Decision: submitted, no decision
  • Abstract: We present a recurrent neural network (RNN) architecture that uses a fully connected deep neural network (DNN) as its feature extractor. The RNN is equipped with both causal temporal prediction and non-causal look-ahead, via auto-regression (AR) and moving-average (MA), respectively. The focus of this paper is a primal-dual training method that formulates the learning of the RNN as a formal optimization problem with an inequality constraint that guarantees stability of the network dynamics. Experimental results demonstrate the effectiveness of this new method, which achieves an 18.86% phone recognition error rate on the core test set of the TIMIT benchmark. The results also show that the proposed primal-dual training method produces lower recognition errors than earlier popular RNN training methods, which rely on a carefully tuned threshold parameter that heuristically prevents the gradient from exploding.
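
The primal-dual idea in the abstract can be illustrated with a minimal sketch on a toy problem. This is not the authors' implementation. The abstract does not state the exact form of the stability constraint, so the sketch assumes a bound on the absolute row sums of the recurrent weight matrix (a norm bound of the kind often used as a sufficient condition for the echo-state property), and it replaces the RNN training loss with a simple quadratic loss. The names `primal_dual_train`, `target`, `bound`, `lr_primal`, and `lr_dual` are hypothetical.

```python
def sign(x):
    """Return -1, 0, or 1 (a subgradient of |x|)."""
    return (x > 0) - (x < 0)


def primal_dual_train(target, bound, lr_primal=0.05, lr_dual=0.05, steps=3000):
    """Toy primal-dual (Lagrangian) gradient descent-ascent.

    Minimizes 0.5 * sum_ij (W_ij - target_ij)^2, a stand-in for a training
    loss, subject to the ASSUMED stability constraint
        sum_j |W_ij| - bound <= 0   for every row i,
    using one nonnegative dual variable (Lagrange multiplier) per row.
    """
    rows, cols = len(target), len(target[0])
    W = [[0.0] * cols for _ in range(rows)]
    lam = [0.0] * rows  # dual variables, kept >= 0

    for _ in range(steps):
        # Constraint values g_i(W) at the current iterate.
        g = [sum(abs(w) for w in W[i]) - bound for i in range(rows)]

        # Primal step: gradient descent on the Lagrangian with respect to W.
        for i in range(rows):
            for j in range(cols):
                grad = (W[i][j] - target[i][j]) + lam[i] * sign(W[i][j])
                W[i][j] -= lr_primal * grad

        # Dual step: projected gradient ascent on the multipliers.
        lam = [max(0.0, lam[i] + lr_dual * g[i]) for i in range(rows)]

    return W, lam


# Row 0 of the target violates the bound; row 1 already satisfies it.
W, lam = primal_dual_train([[2.0, 1.5], [0.3, 0.2]], bound=1.5)
print(W, lam)
```

In this toy run the multiplier for the violating row should grow until that row sits on the constraint boundary, while the multiplier for the feasible row stays at zero. The constraint is therefore enforced by the optimization itself, without a hand-tuned clipping threshold.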
