Learning Longer-term Dependencies in RNNs with Auxiliary Losses

Trieu H. Trinh, Andrew M. Dai, Minh-Thang Luong, Quoc V. Le

Feb 12, 2018 · ICLR 2018 Workshop Submission
  • Abstract: We present a simple method to improve the learning of long-term dependencies in recurrent neural networks (RNNs) by introducing unsupervised auxiliary losses. These auxiliary losses force RNNs to either reconstruct the distant past or predict the future, enabling truncated backpropagation through time (BPTT) to work on very long sequences. We experimented on sequences up to 16,000 tokens long and report faster training, better resource efficiency, and better test performance than full BPTT baselines such as Long Short-Term Memory (LSTM) networks or Transformers.
  • Keywords: Deep Learning, Semi-supervised Learning, Unsupervised Learning, Long-term dependencies, Recurrent Neural Networks, Auxiliary Losses
  • TL;DR: Combining auxiliary losses and truncated backpropagation through time in RNNs improves resource efficiency, training speed, and generalization in learning long-term dependencies (a minimal illustrative sketch follows below).
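
The following is a minimal sketch (PyTorch, not the authors' released code) of the "reconstruct the distant past" auxiliary loss described in the abstract: a small decoder, conditioned on the RNN hidden state at a randomly chosen anchor position, is trained to reconstruct the tokens immediately preceding that anchor, and this reconstruction loss is added to the main task loss. All module names, sizes, the anchor-sampling scheme, and the loss weight are illustrative assumptions; the paper's additional step of truncating BPTT around the anchors is not shown here.

```python
# Minimal sketch of an LSTM classifier with an unsupervised past-reconstruction
# auxiliary loss. Hyperparameters and names are illustrative assumptions.
import torch
import torch.nn as nn

class AuxiliaryLSTM(nn.Module):
    def __init__(self, vocab_size=256, hidden=128, num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)       # main task head
        self.decoder_rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder_out = nn.Linear(hidden, vocab_size)       # reconstruction head

    def forward(self, tokens):
        hidden, state = self.rnn(self.embed(tokens))
        return hidden, state

def main_and_aux_loss(model, tokens, labels, aux_len=16, aux_weight=0.5):
    """tokens: (B, T) token ids; labels: (B,) classification targets."""
    B, T = tokens.shape
    hidden, _ = model(tokens)                                   # (B, T, H)

    # Main supervised loss from the final hidden state.
    logits = model.classifier(hidden[:, -1])
    main_loss = nn.functional.cross_entropy(logits, labels)

    # Auxiliary loss: pick a random anchor position and ask the decoder,
    # initialized from the hidden state at the anchor, to reconstruct the
    # aux_len tokens that came just before it (teacher forcing).
    anchor = torch.randint(aux_len, T, (1,)).item()
    target = tokens[:, anchor - aux_len:anchor]                 # (B, aux_len)
    dec_in = model.embed(target)
    h0 = hidden[:, anchor].unsqueeze(0).contiguous()            # (1, B, H)
    c0 = torch.zeros_like(h0)
    dec_h, _ = model.decoder_rnn(dec_in, (h0, c0))
    recon = model.decoder_out(dec_h)                            # (B, aux_len, V)
    aux_loss = nn.functional.cross_entropy(
        recon.reshape(-1, recon.size(-1)), target.reshape(-1))

    return main_loss + aux_weight * aux_loss

if __name__ == "__main__":
    model = AuxiliaryLSTM()
    tokens = torch.randint(0, 256, (4, 512))   # toy "long" sequences
    labels = torch.randint(0, 10, (4,))
    loss = main_and_aux_loss(model, tokens, labels)
    loss.backward()
    print(float(loss))
```

In this sketch the auxiliary loss is purely unsupervised (it only uses the input tokens), which is what lets it act as a regularizer and memory aid without extra labels; the relative weighting between main and auxiliary losses is a tunable choice.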