Keywords: LSTMs, learning dynamics, language modeling
TL;DR: LSTMs learn long-range dependencies compositionally by building them from shorter constituents over the course of training.
Abstract: LSTM-based language models exhibit compositionality in their representations, but how this behavior emerges over the course of training has not been explored. Using contextual decomposition to analyze experiments on synthetic data, we find that LSTMs learn long-range dependencies compositionally, building them from shorter constituents during training.