Keywords: low-light video enhancement, zero-reference method, semantic feature, frequency feature
TL;DR: A zero-reference LLVE method that combines temporally smooth illumination estimation, gated by multiscale frame similarity, with semantic-frequency guided denoising.
Abstract: Low-light video enhancement (LLVE) is important for real-world applications in which degraded visibility impairs human perception or downstream vision tasks. Zero-reference methods do not require paired training data, but they often produce flickering and struggle to suppress noise while preserving image details. We propose Illumination Smoothness and Semantic-frequency Denoising (IS-SFD), a zero-reference framework that enhances low-light videos through temporal illumination modeling and denoising guided by semantic and frequency features. To ensure temporal consistency, we introduce a Gated Illumination Estimation Network (GIE-Net) that adaptively fuses multi-frame features through a gating mechanism guided by the multiscale similarity of adjacent video frames. For denoising, we design a Semantic-frequency Guided Reflection Denoising Network (SGRD-Net), which combines frequency features from a discrete wavelet transform (DWT) encoder with semantic features from a frozen CLIP encoder. These features are fused to suppress noise while maintaining structural details in critical areas such as object boundaries. Experiments demonstrate that IS-SFD outperforms existing methods in visual quality and temporal consistency, establishing a new baseline for zero-reference LLVE. The code will be made available upon acceptance of the paper.
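The gating idea described in the abstract can be illustrated with a small sketch. This is not the paper's GIE-Net: the learned network is replaced here by a hand-written scalar gate, and the function names (`avg_pool2x`, `multiscale_similarity`, `gated_fusion`), the similarity measure (one minus mean absolute difference), and the 0.5 gate cap are all assumptions made for illustration. It only shows how a similarity score computed at several scales might control how much of a neighboring frame is blended into the current one.

```python
import numpy as np

def avg_pool2x(x):
    """Downsample a 2-D array by averaging non-overlapping 2x2 blocks."""
    h, w = x.shape
    h2, w2 = h - h % 2, w - w % 2  # crop odd rows/columns
    x = x[:h2, :w2]
    return x.reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))

def multiscale_similarity(a, b, num_scales=3):
    """Average of (1 - mean absolute difference) over several scales.

    Assumes both frames hold values in [0, 1], so the result is in [0, 1].
    """
    sims = []
    for _ in range(num_scales):
        sims.append(1.0 - np.abs(a - b).mean())
        if min(a.shape) < 2:  # cannot pool any further
            break
        a, b = avg_pool2x(a), avg_pool2x(b)
    return float(np.mean(sims))

def gated_fusion(current, neighbor, num_scales=3):
    """Blend a neighboring frame into the current one with a scalar gate.

    Hypothetical gate: half the similarity, so the neighbor never
    outweighs the current frame and dissimilar frames are ignored.
    """
    sim = multiscale_similarity(current, neighbor, num_scales)
    gate = 0.5 * sim
    fused = (1.0 - gate) * current + gate * neighbor
    return fused, gate
```

With identical frames the gate reaches its maximum of 0.5 and the fused result equals the input; with maximally different frames the gate drops to 0 and the neighbor is discarded, which is the behavior one would want in order to avoid ghosting across scene changes.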
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 19095