Abstract: Recurrent models for sequences have recently been successful at many tasks, especially language modeling
and machine translation. Nevertheless, it remains challenging to extract good representations from
these models. For instance, even though language has a clear hierarchical structure going from characters
through words to sentences, this structure is not apparent in current language models.
We propose to improve the representation in sequence models by
augmenting current approaches with an autoencoder that is forced to compress
the sequence through an intermediate discrete latent space. In order to propagate gradients
through this discrete representation, we introduce an improved semantic hashing technique.
We show that this technique performs well on a newly proposed quantitative efficiency measure.
We also analyze the latent codes produced by the model, showing how they correspond to
words and phrases. Finally, we present an application of the autoencoder-augmented
model to generating diverse translations.
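The gradient trick behind the discrete bottleneck is implemented in the linked tensor2tensor repository; the sketch below is only a minimal illustration of the improved-semantic-hashing idea, not a verbatim copy of that code. It assumes the saturating-sigmoid rescaling (1.2·σ(x) − 0.1), Gaussian noise on the pre-activations, and a 50/50 mix of the dense and discretized paths during training; treat those specifics as assumptions drawn from the paper's description.

```python
import tensorflow as tf

def saturating_sigmoid(x):
    # Sigmoid rescaled and clipped so it can actually reach 0 and 1
    # (constants 1.2 / 0.1 assumed, mirroring the paper's description).
    return tf.clip_by_value(1.2 * tf.sigmoid(x) - 0.1, 0.0, 1.0)

def semantic_hash(logits, training=True, noise_std=1.0):
    """Sketch of improved semantic hashing: turn a dense code into bits
    while keeping a gradient path back to the encoder."""
    if training:
        # Noise on the pre-activations encourages the code to saturate.
        logits += tf.random.normal(tf.shape(logits), stddev=noise_std)
    dense = saturating_sigmoid(logits)            # values in [0, 1]
    hard = tf.cast(dense > 0.5, dense.dtype)      # discrete 0/1 bits
    # Straight-through estimator: forward pass uses the hard bits,
    # gradients flow through the dense saturating-sigmoid path.
    hard_st = dense + tf.stop_gradient(hard - dense)
    if training:
        # Randomly mix the dense and discretized paths during training.
        return tf.cond(tf.random.uniform(()) < 0.5,
                       lambda: hard_st, lambda: dense)
    return hard

# Example: a 16-bit latent code for a batch of 8 sequence positions.
bits = semantic_hash(tf.random.normal([8, 16]))
```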
TL;DR: Autoencoders for text with a new method for using discrete latent space.
Keywords: autoencoders, sequence models, discrete representations
Code: [tensorflow/tensor2tensor](https://github.com/tensorflow/tensor2tensor) + [1 community implementation](https://paperswithcode.com/paper/?openreview=BkQCGzZ0-)
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/discrete-autoencoders-for-sequence-models/code)