Discrete Inversion: A Controllable Latent Space for Multinomial Diffusion and Masked Generative Models
Keywords: Masked Generative Modeling; Discrete Diffusion Model; Multinomial Diffusion
Abstract: Discrete diffusion models have achieved notable success in tasks such as image generation and masked language modeling, yet they face limitations in controlled content editing. This paper introduces Discrete Inversion, the first approach to enable precise inversion for discrete diffusion models, including multinomial diffusion and masked generative models. By recording noise sequences and masking patterns during the forward diffusion process, Discrete Inversion facilitates accurate reconstruction and controlled edits without the need for predefined masks or attention map manipulation. We demonstrate the effectiveness of our method across both image and text domains, evaluating it on models such as VQ-Diffusion, Paella, and RoBERTa. Our results show that Discrete Inversion not only preserves high fidelity to the original data but also enables flexible and user-friendly editing in discrete spaces, significantly advancing the capabilities of discrete generative models.
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5811