DRIP: Unleashing Diffusion Priors for Joint Foreground and Alpha Prediction in Image Matting

Published: 25 Sept 2024, Last Modified: 14 Nov 2024, NeurIPS 2024 poster, CC BY 4.0
Keywords: Image Matting, Diffusion Model, Foreground Estimation
Abstract: Recovering the foreground color and opacity/alpha matte from a single image (i.e., image matting) is a challenging and ill-posed problem where data priors play a critical role in achieving precise results. Traditional methods generally predict the alpha matte and then extract the foreground through post-processing, often failing to produce high-fidelity foreground color. This failure stems from the models' difficulty in learning robust color predictions from limited matting datasets. To address this, we explore the potential of leveraging vision priors embedded in pre-trained latent diffusion models (LDMs) for estimating foreground RGBA values in challenging scenarios and for rare objects. We introduce Drip, a novel approach for image matting that harnesses the rich prior knowledge of LDMs. Our method incorporates a switcher and a cross-domain attention mechanism to extend the original LDM for joint prediction of the foreground color and opacity. This setup facilitates mutual information exchange and ensures high consistency across both modalities. To mitigate the inherent reconstruction errors of the LDM's VAE decoder, we propose a latent transparency decoder to align the RGBA prediction with the input image, thereby reducing discrepancies. Comprehensive experimental results demonstrate that our approach achieves state-of-the-art performance in foreground and alpha predictions and shows remarkable generalizability across various benchmarks.
Primary Area: Diffusion based models
Submission Number: 8775
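
The abstract calls image matting ill-posed. A minimal sketch of why, assuming the standard per-pixel compositing model I = alpha * F + (1 - alpha) * B (the abstract does not state this equation, and the function name `composite` is illustrative, not taken from the paper): two different foreground, alpha, and background combinations can produce the same observed pixel.

```python
def composite(fg, alpha, bg):
    """Blend one RGB pixel: I = alpha * F + (1 - alpha) * B, per channel."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


# Explanation A: a pure red foreground at 50% opacity over a black background.
observed_a = composite(fg=(1.0, 0.0, 0.0), alpha=0.5, bg=(0.0, 0.0, 0.0))

# Explanation B: a dark red foreground that is fully opaque
# (the blue background is completely hidden).
observed_b = composite(fg=(0.5, 0.0, 0.0), alpha=1.0, bg=(0.0, 0.0, 1.0))

# Both explanations yield the same observed color, so the observed pixel
# alone cannot determine the foreground color and alpha.
print(observed_a == observed_b)
```

Each pixel gives three observed values but seven unknowns (three foreground channels, three background channels, and alpha), which is why the abstract emphasizes data priors.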