Keywords: Diffusion-based Object Removal, Image Inpainting
Abstract: Existing diffusion-based object removal and inpainting methods often fail to recover the fine structural and textural details of small objects. This is primarily due to the VAE encoder’s downsampling, which inevitably compresses small masked regions and causes significant detail loss, while the decoder’s upsampling alone cannot fully restore the lost fine details.
However, the adverse effect of this fixed compression can be mitigated by magnifying these regions before encoding, so that they occupy a larger share of the latent space.
To this end, we propose ReFocusEraser, a two-stage framework for small object removal that combines camera-adaptive zoom-in inpainting with robust context- and shadow-aware repair. In Stage I, a camera-adaptive refocus mechanism magnifies masked regions, and a LoRA-tuned diffusion model ensures precise semantic alignment for accurate reconstruction. However, reintegrating these magnified inpainted regions into the original image introduces challenges due to VAE asymmetry, such as color shifts and seams. Stage II addresses these issues by fine-tuning an additional decoder to create a seam- and shadow-aware module that eliminates residual artifacts while preserving background consistency.
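The benefit of the Stage I zoom-in can be sketched with a back-of-the-envelope latent-resolution calculation. The numbers below (an 8x VAE stride, a 512-pixel working resolution, and the `latent_pixels` helper) are illustrative assumptions, not the paper's exact settings:

```python
# Sketch: why a stride-8 VAE loses small-object detail, and how cropping
# plus zooming (as in Stage I's refocus) increases latent coverage.
# All numbers are illustrative assumptions, not ReFocusEraser's settings.

VAE_DOWNSAMPLE = 8  # typical latent-diffusion VAE stride (assumption)

def latent_pixels(mask_w, mask_h, stride=VAE_DOWNSAMPLE):
    """Approximate number of latent pixels covering a masked region."""
    return max(1, mask_w // stride) * max(1, mask_h // stride)

# A 32x32 masked object encoded in the full image spans only ~16 latent
# pixels, so most of its structure is destroyed before denoising begins.
direct = latent_pixels(32, 32)  # 16 latent pixels

# Cropping a 128x128 window around the object and resizing it to 512x512
# magnifies the mask 4x before encoding, so it spans ~256 latent pixels.
zoom_factor = 512 // 128  # 4x magnification
zoomed = latent_pixels(32 * zoom_factor, 32 * zoom_factor)  # 256 latent pixels

print(direct, zoomed)  # prints: 16 256
```

Under these assumed numbers, the refocused crop gives the diffusion model roughly 16x more latent resolution over the masked object, which is the intuition behind inpainting at the magnified view and then reintegrating the result.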
Extensive experiments demonstrate that ReFocusEraser achieves state-of-the-art performance, outperforming existing methods across benchmark datasets.
Related code and data are available at https://github.com/ProAirVerse/ReFocusEraser.git.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 9570