Unpaired Photo-realistic Image Deraining with Energy-informed Diffusion Model

Published: 20 Jul 2024, Last Modified: 06 Aug 2024 | Venue: MM2024 Oral | License: CC BY 4.0
Abstract: Existing unpaired image deraining approaches struggle to accurately capture the characteristics that distinguish the rainy domain from the clean domain, which leaves residual degradation and color distortion in the reconstructed images. To address this, we propose an energy-informed diffusion model for unpaired photo-realistic image deraining (UPID-EDM). First, we investigate the visual-language priors embedded in the contrastive language-image pre-training model (CLIP) and demonstrate that these priors help discriminate between rainy and clean images. Second, we introduce a dual-consistent energy function (DEF) that retains rain-irrelevant characteristics while eliminating rain-relevant features. This energy function is trained on non-corresponding (unpaired) rainy and clean images. Finally, we employ the rain-relevance discarding energy function (RDEF) and the rain-irrelevance preserving energy function (RPEF) to guide the reverse sampling procedure of a pre-trained diffusion model, removing rain streaks while preserving image content. Extensive experiments demonstrate that our energy-informed model surpasses existing unpaired learning approaches in terms of both supervised and no-reference metrics.
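The abstract does not give the sampling equations, so the following is only a toy sketch of the general idea of energy-guided reverse sampling, not the authors' implementation. It assumes the common guidance form in which the denoiser's predicted mean is shifted down the gradient of a summed energy, scaled by the step variance. The names `rain_discard_energy`, `make_preserve_energy`, `guided_mean`, and `guided_reverse_step` are hypothetical; the two energies are hand-written stand-ins for the learned RDEF and RPEF, and the "image" is a short list of floats.

```python
import random


def numerical_grad(energy, x, eps=1e-4):
    """Central finite-difference gradient of a scalar energy at x (list of floats)."""
    grad = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((energy(hi) - energy(lo)) / (2 * eps))
    return grad


def rain_discard_energy(x):
    """Toy stand-in for RDEF (assumption): penalize streak-like jumps between neighbors."""
    return sum((b - a) ** 2 for a, b in zip(x, x[1:]))


def make_preserve_energy(reference):
    """Toy stand-in for RPEF (assumption): keep the mean brightness of the rainy input."""
    ref_mean = sum(reference) / len(reference)

    def energy(x):
        return (sum(x) / len(x) - ref_mean) ** 2

    return energy


def guided_mean(x_t, mu, energies, sigma, scale):
    """Shift the denoiser mean mu down the gradient of the summed energies at x_t."""
    def total(x):
        return sum(e(x) for e in energies)

    grad = numerical_grad(total, x_t)
    return [m - scale * sigma ** 2 * g for m, g in zip(mu, grad)]


def guided_reverse_step(x_t, mu, energies, sigma, scale, rng):
    """One guided reverse step: guided mean plus Gaussian noise of std sigma."""
    mean = guided_mean(x_t, mu, energies, sigma, scale)
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


if __name__ == "__main__":
    rainy = [0.2, 0.9, 0.2, 0.9]  # alternating values mimic rain streaks
    energies = [rain_discard_energy, make_preserve_energy(rainy)]
    # In a real sampler mu would come from the pre-trained diffusion model;
    # here the identity is used as a placeholder.
    print(guided_mean(rainy, list(rainy), energies, sigma=0.3, scale=1.0))
```

In this toy setting the guided mean has smaller neighbor-to-neighbor jumps than the input while its average value is unchanged, which mirrors the stated goal of discarding rain-relevant features while preserving rain-irrelevant content.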
Primary Subject Area: [Content] Vision and Language
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: Our work explores a novel paradigm that combines a multimodal language-image pre-training model with a pre-trained diffusion model to achieve unpaired image deraining, which is more practical than existing paired image deraining approaches. By removing rain-induced artifacts from images without requiring paired rainy-clean datasets, our work improves image quality and fidelity across diverse multimedia applications. Additionally, deraining can aid in the creation of high-quality training datasets for multimodal processing tasks by providing cleaner input images, thus improving the robustness and accuracy of multimodal models. Overall, our work is an innovative application of multimodal technology to unpaired image rain removal, and our unpaired learning strategy is more practical than existing methods.
Supplementary Material: zip
Submission Number: 1010