Learning long-term dependencies is not as difficult with NARX networks

NIPS 1995 (modified: 11 Nov 2022)
Abstract: It has recently been shown that gradient descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies. In this paper we explore this problem for a class of architectures called NARX networks, which have powerful representational capabilities. Previous work reported that gradient descent learning is more effective in NARX networks than in recurrent networks with "hidden states". We show that although NARX networks do not circumvent the problem of long-term dependencies, they can greatly improve performance on such problems. We present some experimental results that show that NARX networks can often retain information for two to three times as long as conventional recurrent networks.
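The abstract contrasts NARX networks with recurrent networks that carry a "hidden state". The distinguishing feature of a NARX model is that the next output is computed directly from tapped delay lines of past outputs and past inputs, rather than from an internal state vector. The following is a minimal sketch of that recurrence; the nonlinearity `f`, the delay order of 3, and the toy input signal are illustrative assumptions, not details from the paper.

```python
import numpy as np

def narx_step(f, y_hist, u_hist):
    """One step of a NARX recurrence: the next output is a function of
    delayed past outputs (y_hist) and delayed past inputs (u_hist),
    with no separate hidden state. Histories are most-recent-first."""
    return f(np.concatenate([y_hist, u_hist]))

# Toy nonlinearity standing in for a trained feedforward network
f = lambda x: np.tanh(x).mean()

delay = 3                                   # assumed output/input delay order
y_hist = np.zeros(delay)                    # output delay line
u_hist = np.zeros(delay)                    # input delay line
u = np.sin(np.linspace(0, 2 * np.pi, 20))   # example input sequence

outputs = []
for t in range(len(u)):
    # shift the input delay line and insert the new input
    u_hist = np.roll(u_hist, 1)
    u_hist[0] = u[t]
    y = narx_step(f, y_hist, u_hist)
    # shift the output delay line and insert the new output
    y_hist = np.roll(y_hist, 1)
    y_hist[0] = y
    outputs.append(y)
```

Because the delay lines expose past outputs directly to the network, gradients can reach events several steps back without passing through a long chain of hidden-state transitions, which is the intuition behind the improved retention the abstract reports.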