Keywords: Object Removal, Object-Effect Attention
Abstract: Object removal requires eliminating not only the target object but also its associated visual effects, such as shadows and reflections. However, diffusion-based inpainting methods often produce artifacts, hallucinate content, alter the background, and struggle to remove object effects accurately. To overcome these limitations, we present a new dataset for OBject-Effect Removal, named OBER, which provides paired images with and without object effects, along with precise masks for both objects and their effects. The dataset comprises high-quality captured and simulated data, covering diverse objects, effects, and complex multi-object scenes. Building on OBER, we propose a novel framework, ObjectClear, which incorporates an object-effect attention mechanism that guides the model toward the foreground removal regions by learning attention masks, effectively decoupling foreground removal from background reconstruction. Furthermore, the predicted attention map enables an attention-guided fusion strategy at inference, greatly preserving background details. Extensive experiments demonstrate that ObjectClear outperforms existing methods, achieving superior object-effect removal quality and background fidelity, especially in challenging real-world scenarios.
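The attention-guided fusion described above can be illustrated with a minimal sketch: the predicted attention map acts as a soft mask, so the final image takes generated pixels only in the removal region and copies the original background elsewhere. The function name and exact blending formulation below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def attention_guided_fusion(generated, original, attn):
    """Blend the model output with the input image using a predicted
    attention map (hypothetical helper, not the paper's exact code).

    generated, original: float arrays of shape (H, W, C)
    attn: float array of shape (H, W), high where content was removed
    """
    attn = np.clip(attn, 0.0, 1.0)[..., None]  # (H, W) -> (H, W, 1)
    # High attention -> keep generated pixels; low -> preserve background.
    return attn * generated + (1.0 - attn) * original

# Toy example: attention is 1 on the left half, 0 on the right half.
gen = np.ones((2, 4, 3))
orig = np.zeros((2, 4, 3))
attn = np.zeros((2, 4))
attn[:, :2] = 1.0
fused = attention_guided_fusion(gen, orig, attn)
```

Because the background term is copied directly from the input rather than regenerated, untouched regions remain pixel-identical, which is what preserves background fidelity at inference.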
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 2700