Recurrent Neural Networks are Universal Filters

Sep 25, 2019 Blind Submission readers: everyone Show Bibtex
  • TL;DR: We show that recurrent neural networks can approximate a large class of optimal filters.
  • Abstract: Recurrent neural networks (RNN) are powerful time series modeling tools in ma- chine learning. It has been successfully applied in a variety of fields such as natural language processing (Mikolov et al. (2010), Graves et al. (2013), Du et al. (2015)), control (Fei & Lu (2017)) and traffic forecasting (Ma et al. (2015)), etc. In those application scenarios, RNN can be viewed as implicitly modelling a stochastic dy- namic system. Another type of popular neural network, deep (feed-forward) neural network has also been successfully applied in different engineering disciplines, whose approximation capability has been well characterized by universal approxi- mation theorem (Hornik et al. (1989), Park & Sandberg (1991), Lu et al. (2017)). However, the underlying approximation capability of RNN has not been fully understood in a quantitative way. In our paper, we consider a stochastic dynamic system with noisy observations and analyze the approximation capability of RNN in synthesizing the optimal state estimator, namely optimal filter. We unify the recurrent neural network into Bayesian filtering framework and show that recurrent neural network is a universal approximator of optimal finite dimensional filters under some mild conditions. That is to say, for any stochastic dynamic systems with noisy sequential observations that satisfy some mild conditions, we show that (informal) ∀ > 0, ∃ RNN-based filter, s.t. lim sup x̂ k|k − E[x k |Y k ] < , k→∞ where x̂ k|k is RNN-based filter’s estimate of state x k at step k conditioned on the observation history and E[x k |Y k ] is the conditional mean of x k , known as the optimal estimate of the state in minimum mean square error sense. As an interesting special case, the widely used Kalman filter (KF) can be synthesized by RNN.
  • Keywords: Recurrent Neural Networks, Expressive Power, Deep Learning Theory
  • Original Pdf:  pdf
0 Replies