LanPaint: Training-Free Diffusion Inpainting with Asymptotically Exact and Fast Conditional Sampling
Abstract: Diffusion models excel at joint pixel sampling for image generation but lack efficient training-free methods for partial conditional sampling (e.g., inpainting with known pixels). Prior works typically formulate this as an intractable inverse problem, relying on coarse variational approximations, heuristic losses requiring expensive backpropagation, or slow stochastic sampling. These limitations preclude (1) accurate distributional matching in inpainting results, (2) efficient inference modes without gradients, and (3) compatibility with fast ODE-based samplers. To address these limitations, we propose LanPaint: a training-free, asymptotically exact partial conditional sampling method for ODE-based and rectified flow diffusion models. By leveraging carefully designed Langevin dynamics, LanPaint enables fast, backpropagation-free Monte Carlo sampling. Experiments demonstrate that our approach achieves superior performance with precise partial conditioning and visually coherent inpainting across diverse tasks.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We sincerely thank the reviewer for their thorough reading and valuable feedback. In response, we have revised the manuscript as follows:
1. Relocated Section 5 ("Related Works") to Section 2. Shortened its first subsection. Incorporated additional citations and commentary.
2. Improved the placement of images and tables for better clarity and flow.
3. Added Figure 8 to experimentally demonstrate that the component of the score discarded by the BiG score is negligible.
4. Moved Appendix G, covering implementation details and sensitivity analysis, to Appendix A.
5. Included a finer grid for the step-size ablation study in Table 3.
6. Expanded the discussion on limitations and future work for greater depth.
7. Introduced a new section addressing the broader impact of the work.
8. Added Table 4 to illustrate the performance of LanPaint and other benchmarks under different samplers.
9. Removed the last figure in the appendix.
10. Corrected various typographical and grammatical errors.
EDIT:
11. Named the score decomposition technique used in FLD the "diffusion damping force" in Section 4.2 (previously Section 3.2) to emphasize how it relates to the diffusion model.
Revisions addressing reviewers' comments are highlighted in red, except for typographical errors.
Assigned Action Editor: ~Jakub_Mikolaj_Tomczak1
Submission Number: 5448