ISET: Invertible Sequence Embedding from Transformer
Keywords: Variable-length sentences, Fixed-length representations, NLP, Pre-trained language models, BERT, Invertible embeddings, Sentence-level embeddings, Semantic representation, Generative Pre-trained Sequence (GPS).
TL;DR: We propose GPS, a generative pretraining method that produces invertible, contextualized sentence-level embeddings within the Transformer framework.
Abstract: Sequence embeddings are essential in Natural Language Processing (NLP) because they convert variable-length sentences into fixed-length representations suitable for deep learning tasks. Despite the achievements of pre-trained language models such as BERT, efforts continue to improve sentence embeddings through contrastive learning methods. Moreover, the distinction between semantic and invertible embeddings underlines the need for representations that not only capture meaning but also allow reconstruction of the original sentence. Before the emergence of invertible embeddings, the Transformer architecture lacked readily available methods for achieving this goal. In response, we introduce an approach for creating invertible sentence-level embeddings within the Transformer framework. To obtain contextualized sequence embeddings, we propose a Generative Pre-trained Sequence (GPS) model that predicts following sequences from previous sequences and comprises four steps: transformer-based symbol embeddings (optional), sequence-wise aggregation, GPS pretraining, and coupled GPS pretraining. This method leverages the strengths of Transformers while addressing the need for invertibility in sentence embeddings, paving the way for applications that require both semantic representation and the ability to reconstruct the original sentence.
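The abstract only names the four GPS steps, so the following minimal PyTorch sketch illustrates one possible reading of the first three (symbol embeddings, sequence-wise aggregation, and next-sequence prediction pretraining). All module names, the mean-pooling aggregation, the cosine objective, and the dimensions are assumptions rather than the paper's specification, and coupled GPS pretraining (step 4) is not shown.

```python
# Hypothetical sketch of next-sequence prediction over pooled sequence embeddings.
# Module names, pooling choice, loss, and sizes are assumptions; the abstract only
# names the steps (symbol embeddings, sequence-wise aggregation, GPS pretraining).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GPSSketch(nn.Module):
    def __init__(self, vocab_size=30522, dim=256, n_heads=4, n_layers=2):
        super().__init__()
        # Step 1 (optional): transformer-based symbol embeddings
        self.symbol_emb = nn.Embedding(vocab_size, dim)
        enc_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.token_encoder = nn.TransformerEncoder(enc_layer, n_layers)
        # Predictor over sequence embeddings for next-sequence prediction
        seq_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.seq_predictor = nn.TransformerEncoder(seq_layer, n_layers)

    def embed_sequence(self, token_ids):
        # (batch, seq_len) -> (batch, dim)
        h = self.token_encoder(self.symbol_emb(token_ids))
        # Step 2: sequence-wise aggregation (mean pooling assumed here)
        return h.mean(dim=1)

    def forward(self, doc_token_ids):
        # doc_token_ids: (batch, n_seqs, seq_len) -- consecutive sequences per document
        b, n, l = doc_token_ids.shape
        seq_emb = self.embed_sequence(doc_token_ids.view(b * n, l)).view(b, n, -1)
        # Step 3: GPS pretraining -- predict the embedding of sequence t+1
        # from sequences <= t, enforced by a causal attention mask
        causal_mask = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
        pred = self.seq_predictor(seq_emb, mask=causal_mask)
        loss = 1 - F.cosine_similarity(pred[:, :-1], seq_emb[:, 1:], dim=-1).mean()
        return loss

model = GPSSketch()
fake_doc = torch.randint(0, 30522, (2, 5, 16))  # 2 documents, 5 sequences, 16 tokens each
print(model(fake_doc).item())
```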
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7835