STARNet: Low-light video enhancement using spatio-temporal consistency aggregation
Abstract: Capturing high-quality video in low-light environments is challenging because of the limited number of
photons reaching the sensor. Previous low-light enhancement approaches often produce over-smoothed details,
temporal flicker, and color deviation. We propose STARNet, an end-to-end video enhancement network that leverages temporal
consistency aggregation to address these issues. We introduce a spatio-temporal consistency aggregator, which
extracts structures from multiple frames in a hidden space to overcome detail corruption and temporal flicker. It
parameterizes neighboring frames to extract and align consistent features, then selectively fuses them
to restore clear structures. To further enhance temporal consistency, we develop a local temporal
consistency constraint that is robust to warping errors from motion estimation. Furthermore, we
employ a normalized low-frequency color constraint to regularize colors toward those of normal-light conditions.
Extensive experimental results on real datasets show that the proposed method achieves better detail fidelity,
color accuracy, and temporal consistency, outperforming state-of-the-art approaches.
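To illustrate the idea behind a warping-error-robust temporal consistency constraint, the following is a minimal sketch (not the paper's implementation; the function name, threshold, and masking rule are assumptions). It penalizes differences between the current enhanced frame and the motion-warped previous enhanced frame, but only at pixels where the motion estimate appears reliable, so that failures of the motion estimator do not dominate the loss.

```python
import numpy as np

def temporal_consistency_loss(enhanced_cur, warped_prev, warp_error, tau=0.05):
    """Hypothetical local temporal consistency loss.

    enhanced_cur: current enhanced frame (H x W array).
    warped_prev:  previous enhanced frame warped to the current frame
                  using estimated motion (H x W array).
    warp_error:   per-pixel warping error of the motion estimate
                  (e.g. photometric error on the input frames).
    tau:          threshold below which the motion estimate is trusted.
    """
    # Mask out pixels where motion estimation likely failed.
    mask = (warp_error < tau).astype(np.float32)
    diff = np.abs(enhanced_cur - warped_prev)
    # Average the discrepancy over trusted pixels only (avoid div-by-zero).
    return float((mask * diff).sum() / max(mask.sum(), 1.0))
```

The masking makes the constraint local: only regions with reliable motion contribute, which is one simple way to gain robustness against warping errors.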