OscillationInversion: Understand the structure of Large Flow Model through the Lens of Inversion Method
Keywords: diffusion models; image generation
Abstract: We investigate oscillation phenomena observed in inversion methods applied to large text-to-image diffusion models, particularly the "Flux" model. Using a fixed-point-inspired iteration method to invert real-world images, we find that the solution does not converge but instead oscillates between distinct clusters. Our results, validated on both real diffusion models and toy experiments, show that these oscillating clusters exhibit significant semantic coherence.
We propose that this phenomenon arises from oscillatory solutions in dynamical systems, linking it to the structure of rectified flow models. The oscillating clusters serve as local latent distributions that allow for effective semantic-based image optimization. We provide theoretical insights, linking these oscillations to fixed-point dynamics and proving conditions for stable cluster formation and differentiation in flow models.
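To make the non-convergence described in the abstract concrete, here is a minimal one-dimensional sketch of a fixed-point-style inversion that settles into two alternating values instead of a single solution. It is an illustration under stated assumptions, not the paper's method or its toy experiment: the velocity field `velocity`, the `gain` parameter, and the single-Euler-step model `x = z + v(z)` are all hypothetical choices made for this example.

```python
import math


def velocity(z, gain=2.0):
    """Hypothetical toy velocity field (an assumption, not taken from the paper)."""
    return gain * math.tanh(z)


def fixed_point_inversion(x, z0, steps=200, gain=2.0):
    """Iterate z_{k+1} = x - v(z_k), which seeks a latent z with z + v(z) = x.

    Returns the full list of iterates so the tail behaviour can be inspected.
    """
    trajectory = [z0]
    z = z0
    for _ in range(steps):
        z = x - velocity(z, gain)
        trajectory.append(z)
    return trajectory


# gain = 2.0: the update is not a contraction near the true solution z = 0,
# so the iterates are pushed away from it and end up alternating in sign.
oscillating = fixed_point_inversion(x=0.0, z0=0.1, gain=2.0)

# gain = 0.5: the update is a contraction, so the iterates shrink toward z = 0.
converging = fixed_point_inversion(x=0.0, z0=0.1, gain=0.5)

print("oscillating tail:", [round(z, 4) for z in oscillating[-4:]])
print("converging tail: ", [round(z, 4) for z in converging[-4:]])
```

In this toy setting the even-indexed and odd-indexed iterates form two separate groups, a one-dimensional analogue of the two clusters the abstract describes. Whether the same mechanism explains the behaviour on Flux is the paper's claim to establish, not something this sketch shows.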
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Submission Number: 3533