Abstract: Videos captured in low-light conditions often suffer from distortions such as noise, low contrast, color imbalance, and blur. Consequently, a post-processing workflow is necessary but typically time-consuming. Developing AI-based tools for video also requires significantly more computational resources than for images. This paper introduces a novel framework that reduces memory usage and computation time by enhancing videos in the wavelet domain. The framework uses conditional diffusion models to enhance brightness and adjust colors in the low-pass subbands, while employing inter-scale attention mechanisms to sharpen the high-pass subbands. To ensure temporal consistency, we integrate feature alignment and fusion into the denoiser of the diffusion models. Additionally, we introduce adaptive brightness adjustment as a preprocessing module to reduce the workload of the learnable networks. Experimental results demonstrate that our method outperforms existing low-light video enhancement techniques while achieving inference times competitive with image-based methods.
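As a rough illustration of the pipeline the abstract describes, the sketch below decomposes each frame with a one-level 2D DWT (via PyWavelets), routes the low-pass subband through a placeholder for the conditional diffusion model and the high-pass subbands through a placeholder for the inter-scale attention module, then inverts the transform. The gamma-based brightness adjustment, the function names, and the identity stand-ins are assumptions for illustration only, not the paper's implementation; the temporal alignment and fusion inside the denoiser are omitted.

```python
# Minimal sketch of the wavelet-domain enhancement pipeline described above.
# `enhance_lowpass_diffusion` and `sharpen_highpass_attention` are hypothetical
# stand-ins for the paper's learned modules (conditional diffusion denoiser
# and inter-scale attention); they are NOT the authors' code.
import numpy as np
import pywt


def adaptive_brightness_adjust(frame: np.ndarray, target_mean: float = 0.45) -> np.ndarray:
    """Gamma-based brightness pre-adjustment (an assumed simple formulation):
    pull a dark frame's mean intensity toward a target value."""
    mean = max(frame.mean(), 1e-6)
    gamma = np.log(target_mean) / np.log(mean)
    return np.clip(frame ** gamma, 0.0, 1.0)


def enhance_lowpass_diffusion(ll: np.ndarray) -> np.ndarray:
    """Placeholder for the conditional diffusion model that restores
    brightness and color in the low-pass subband."""
    return ll  # identity stand-in


def sharpen_highpass_attention(ll: np.ndarray, highs: tuple) -> tuple:
    """Placeholder for the inter-scale attention module that sharpens the
    high-pass subbands, conditioned on the low-pass content."""
    return highs  # identity stand-in


def enhance_frame(frame: np.ndarray, wavelet: str = "haar") -> np.ndarray:
    """Enhance one grayscale frame (values in [0, 1]) in the wavelet domain."""
    frame = adaptive_brightness_adjust(frame)        # preprocessing module
    ll, highs = pywt.dwt2(frame, wavelet)            # one-level 2D DWT
    ll = enhance_lowpass_diffusion(ll)               # low-pass: diffusion
    highs = sharpen_highpass_attention(ll, highs)    # high-pass: attention
    return np.clip(pywt.idwt2((ll, highs), wavelet), 0.0, 1.0)


# Example: enhance a short synthetic dark clip frame by frame
# (temporal feature alignment/fusion omitted in this sketch).
video = np.random.rand(8, 128, 128) * 0.2
enhanced = np.stack([enhance_frame(f) for f in video])
```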
DOI: 10.1145/3697294.3697304