Keywords: transformer, VAE
TL;DR: We extend the decoder Transformer with VAE-style latent variables.
Abstract: We propose an extension of the decoder Transformer that conditions its
generative process on random latent variables. These variables are
learned without supervision via a variational procedure.
Experimental evaluations show that such conditioning yields
substantial improvements on downstream tasks.
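The abstract does not specify the architecture, but the variational core it describes typically rests on the reparameterization trick plus a KL penalty toward a standard-normal prior. A minimal sketch of those two pieces, with hypothetical shapes (batch of 4, latent dimension 8), assuming a diagonal-Gaussian posterior:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

# Hypothetical encoder outputs: mu = 0, log_var = 0 (i.e., posterior = prior).
mu = np.zeros((4, 8))
log_var = np.zeros((4, 8))

z = reparameterize(mu, log_var, rng)   # latents the decoder would condition on
kl = kl_to_standard_normal(mu, log_var)
```

In a full model, `z` would be injected into the decoder Transformer (e.g., as an extra input token or an additive bias) and `kl` added to the reconstruction loss; those design choices are assumptions here, not details from the submission.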
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 3976