Keywords: Diffusion model, Underwater image enhancement, VLM
Abstract: Underwater images often suffer from severe color distortion and texture degradation caused by light absorption and scattering, posing significant challenges for visual perception and restoration. Recent diffusion-based underwater image enhancement (UIE) methods have shown remarkable performance, but most rely on customized architectures trained from scratch or lack auxiliary guidance beyond image-level inputs, which limits model generalization and controllability. In this work, we propose CoDe, a UIE framework for semantic Color reasoning and high-fidelity Detail synthesis that fully leverages the synergy between diffusion models and vision-language models. It explicitly disentangles the color and texture of underwater images: a fine-tuned LLaVA provides domain-invariant semantic color cues for robust color correction, while an SDXL-based generator restores high-frequency details for sharp reconstruction. Furthermore, we design an adaptive degradation-aware feature modulation module that fuses underwater and clean-domain representations, effectively suppressing noise interference during the denoising diffusion process. Extensive experiments on multiple underwater benchmarks demonstrate that CoDe achieves superior performance, significantly improving both color fidelity and texture preservation.
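The abstract does not specify how the adaptive degradation-aware feature modulation fuses the two domains. As a minimal illustrative sketch only, one plausible form is a learned, degradation-conditioned gate that blends underwater-domain and clean-domain feature maps; all names here (DegradationAwareModulation, gate_proj, fuse) are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

class DegradationAwareModulation(nn.Module):
    """Hypothetical sketch: fuse underwater-domain and clean-domain
    features via a degradation-conditioned per-channel gate."""
    def __init__(self, channels: int):
        super().__init__()
        # Estimate a per-channel gate in [0, 1] from the underwater features.
        self.gate_proj = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Project the concatenated features back to the working width.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, f_uw: torch.Tensor, f_clean: torch.Tensor) -> torch.Tensor:
        # High gate values keep more of the (degraded) underwater signal;
        # low values lean on the clean-domain representation instead.
        g = self.gate_proj(f_uw)
        modulated = g * f_uw + (1.0 - g) * f_clean
        return self.fuse(torch.cat([modulated, f_clean], dim=1))

# Usage on dummy feature maps:
m = DegradationAwareModulation(channels=64)
f_uw = torch.randn(1, 64, 32, 32)     # underwater-domain features
f_clean = torch.randn(1, 64, 32, 32)  # clean-domain features
out = m(f_uw, f_clean)                # -> torch.Size([1, 64, 32, 32])
```

A gated blend of this kind is one common way to suppress degradation-specific noise while retaining content cues; the paper's actual module may differ in both structure and conditioning signal.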
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 5440