Isotropic Contextual Representations through Variational Regularization

ICLR 2022 Submission. Published: 28 Jan 2022; last modified: 13 Feb 2023.
Abstract: Contextual language representations achieve state-of-the-art performance across various natural language processing tasks. However, these representations have been shown to suffer from the degeneration problem, i.e., they occupy a narrow cone in the latent space. This problem can be addressed by enforcing isotropy in the latent space. By analogy to variational autoencoders, we propose applying a token-level variational loss to a Transformer architecture and introduce the prior distribution's standard deviation as a model parameter to optimize isotropy. The encoder-decoder architecture allows for learning interpretable embeddings that can be decoded back into text. Sentence-level features extracted from the model achieve competitive results on benchmark classification tasks.
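To make the abstract's core idea concrete, the following is a minimal PyTorch sketch, not the authors' implementation, of a token-level variational head whose prior standard deviation is a learnable model parameter. The module name, layer choices, dimensions, and placement of the KL term are assumptions for illustration only.

```python
import torch
import torch.nn as nn


class TokenVariationalHead(nn.Module):
    """Sketch: map each token representation to a Gaussian posterior
    q(z|x) = N(mu, sigma^2) and compute its KL divergence from a prior
    N(0, sigma_p^2) whose standard deviation sigma_p is learned."""

    def __init__(self, hidden_dim: int, latent_dim: int):
        super().__init__()
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_log_sigma = nn.Linear(hidden_dim, latent_dim)
        # Learnable prior standard deviation, stored in log space for positivity.
        self.prior_log_sigma = nn.Parameter(torch.zeros(1))

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden_dim) from a Transformer encoder.
        mu = self.to_mu(hidden_states)
        log_sigma = self.to_log_sigma(hidden_states)
        sigma = log_sigma.exp()

        # Reparameterization trick: sample one latent vector per token.
        z = mu + sigma * torch.randn_like(sigma)

        # Closed-form KL( N(mu, sigma^2) || N(0, sigma_p^2) ), averaged over tokens.
        prior_sigma = self.prior_log_sigma.exp()
        kl = (self.prior_log_sigma - log_sigma
              + (sigma ** 2 + mu ** 2) / (2 * prior_sigma ** 2)
              - 0.5)
        return z, kl.mean()
```

In a VAE-style setup such as the one the abstract describes, this KL term would be added (with some weight) to the decoder's reconstruction loss; since the prior standard deviation is a trainable parameter, the strength of the pull toward an isotropic Gaussian is itself optimized during training.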