In this work, we focus on a novel and practical task, i.e., Time-vAriant duo-iMage inPainting (TAMP). The aim of TAMP is to inpaint two damaged images by leveraging their complementary information, where the two images are captured of the same scene with a significant time gap between them, i.e., a time-variant duo-image. Unlike existing reference-guided image inpainting, TAMP accounts for the potential pixel damage and content mismatch of reference images when they are collected from the Internet for real-world applications. In particular, our study finds that even state-of-the-art (SOTA) reference-guided image inpainting methods fail on this task due to inappropriate image complementation. To address this issue, we propose a novel Interactive Transition Distribution Estimation (ITDE) module that interactively complements the duo-image with semantic consistency and provides refined inputs for the subsequent image inpainting process. ITDE is independent of the inpainting pipeline, making it a plug-and-play image complementation module. We therefore further propose the Interactive Transition Distribution-driven Diffusion (ITDiff) model, which integrates ITDE with a SOTA diffusion model, as our final solution for TAMP. Moreover, given the lack of benchmarks for the TAMP task, we assemble a new dataset, TAMP-Street, based on existing image and mask datasets. Experiments on our TAMP-Street dataset and the conventional DPED50k dataset show that our method consistently outperforms SOTA reference-guided image inpainting methods on TAMP.
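The abstract describes ITDE and ITDiff only at a high level; the sketch below is a minimal illustration, not the authors' implementation, of how a plug-and-play complementation module could sit in front of an off-the-shelf diffusion inpainter. All names (`ComplementationModule`, `inpaint_duo`, `diffusion_inpainter`), the layer choices, and the input/output shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ComplementationModule(nn.Module):
    """Hypothetical stand-in for an ITDE-like module: given a time-variant
    duo-image (two damaged views of the same scene) and their damage masks,
    it produces a refined version of each image by borrowing content from
    the other view."""

    def __init__(self, channels: int = 64):
        super().__init__()
        # Placeholder encoder/decoder; the actual ITDE architecture is not
        # specified in the abstract.
        self.encode = nn.Conv2d(8, channels, 3, padding=1)   # 2 RGB images + 2 masks
        self.decode = nn.Conv2d(channels, 6, 3, padding=1)   # 2 refined RGB images

    def forward(self, img_a, img_b, mask_a, mask_b):
        x = torch.cat([img_a, img_b, mask_a, mask_b], dim=1)
        refined = self.decode(torch.relu(self.encode(x)))
        refined_a, refined_b = refined[:, :3], refined[:, 3:]
        return refined_a, refined_b


def inpaint_duo(itde, diffusion_inpainter, img_a, img_b, mask_a, mask_b):
    """Plug-and-play composition: the complementation module first refines
    the duo-image, then any inpainter (e.g., a diffusion model) completes
    each refined image independently."""
    refined_a, refined_b = itde(img_a, img_b, mask_a, mask_b)
    out_a = diffusion_inpainter(refined_a, mask_a)
    out_b = diffusion_inpainter(refined_b, mask_b)
    return out_a, out_b
```

Because the complementation step only rewrites the inputs handed to the inpainter, any existing inpainting pipeline can be dropped in as `diffusion_inpainter` without modification, which is what the plug-and-play claim amounts to in this sketch.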