Abstract: This study addresses the discrepancy between training and inference in discrete diffusion models for text generation. We propose two novel strategies: (1) a training scheme that considers two-step diffusion processes, allowing the model to use its own predicted output as input for subsequent steps during training, and (2) a scheduling technique that gradually increases the probability of using self-generated text as training progresses. Experiments conducted on four widely used text generation benchmark datasets demonstrate that both proposed strategies improve the performance of discrete diffusion models in text generation.
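To make the two ideas in the abstract concrete, the following is a minimal, illustrative sketch of (1) a two-step training iteration that feeds the model's own prediction back in as the input for the next denoising step, and (2) a schedule that raises the probability of doing so as training progresses. All names (`DenoiserToy`, `corrupt`, `self_sample_prob`), the absorbing-state corruption, and the linear schedule are assumptions made for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MASK_ID, SEQ_LEN = 100, 0, 16

class DenoiserToy(nn.Module):
    """Stand-in denoiser: embeds corrupted tokens and predicts the originals."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, 64)
        self.out = nn.Linear(64, VOCAB)

    def forward(self, x_t):
        return self.out(self.emb(x_t))  # (batch, seq, vocab) logits

def corrupt(x0, t, num_steps):
    """Absorbing-state corruption (assumed): mask each token with prob t / num_steps."""
    mask = torch.rand_like(x0, dtype=torch.float) < (t / num_steps)
    return torch.where(mask, torch.full_like(x0, MASK_ID), x0)

def self_sample_prob(train_step, total_train_steps, max_prob=0.5):
    """Scheduled probability of training on self-generated text.
    A linear ramp is assumed here; the abstract only says it increases gradually."""
    return max_prob * train_step / total_train_steps

def two_step_loss(model, x0, t, num_steps, train_step, total_train_steps):
    # Step 1: standard denoising loss on a corruption of the ground truth.
    x_t = corrupt(x0, t, num_steps)
    logits_t = model(x_t)
    loss = F.cross_entropy(logits_t.flatten(0, 1), x0.flatten())

    # Step 2: with scheduled probability, corrupt the model's own prediction
    # (instead of the ground truth) for the next, less-noisy step, so that
    # training-time inputs resemble what the model sees at inference.
    if torch.rand(()) < self_sample_prob(train_step, total_train_steps):
        x0_hat = logits_t.argmax(-1).detach()
        x_prev = corrupt(x0_hat, t - 1, num_steps)
    else:
        x_prev = corrupt(x0, t - 1, num_steps)
    logits_prev = model(x_prev)
    loss = loss + F.cross_entropy(logits_prev.flatten(0, 1), x0.flatten())
    return loss

model = DenoiserToy()
x0 = torch.randint(1, VOCAB, (4, SEQ_LEN))  # toy "clean" token batch
loss = two_step_loss(model, x0, t=8, num_steps=10,
                     train_step=500, total_train_steps=1000)
loss.backward()
```

The key design point this sketch tries to capture is that the second denoising step is sometimes conditioned on self-generated text rather than the ground truth, with that mixing probability growing over training so the model is eased into inference-like inputs.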