TEncDM: Understanding the Properties of Diffusion Model in the Space of Language Model Encodings

Anonymous

16 Feb 2024
ACL ARR 2024 February Blind Submission
Readers: Everyone
Abstract: Drawing inspiration from the success of diffusion models in various domains, numerous research papers have proposed methods for adapting diffusion models to the text domain. Despite these efforts, none of them has managed to match the quality of large language models. In this paper, we conduct a comprehensive analysis of the key components of text diffusion models and introduce a novel approach named Text Encoding Diffusion Model (TEncDM). Instead of the commonly used token embedding space, we train our model in the space of language model encodings. Additionally, we propose using a Transformer-based decoder that utilizes contextual information for text reconstruction. We also analyse self-conditioning and find that it increases the magnitude of the model outputs, which allows the number of denoising steps at the inference stage to be reduced. Evaluation of TEncDM on two downstream text generation tasks, QQP and XSum, demonstrates its superiority over existing non-autoregressive models.
Paper Type: long
Research Area: Generation
Contribution Types: Model analysis & interpretability
Languages Studied: English
