NO DARK DATA REQUIRED: BRIDGING THE GAP BETWEEN NORMAL AND LOW-LIGHT DETECTION VIA RETINEX DECOMPOSITION

ICLR 2026 Conference Submission 11093 Authors

18 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Low-light images, Adversarial Visibility Condition, Retinex theory, Image Processing.
TL;DR: We propose a robust Retinex-guided framework that learns illumination-aware features for accurate object detection under low-light and foggy conditions.
Abstract: Conventional low-light object detection approaches typically place a distinct image enhancement module before the detector, which can compromise performance due to misaligned objectives and reduced robustness in challenging visual contexts. Many existing methods either do not optimise both tasks jointly or overlook latent features that are essential for accurate detection. To address this, we propose a novel end-to-end framework trained exclusively on normal-light images, eliminating the need for low-light data during training. The approach draws inspiration from Retinex theory, which separates an image into reflectance (representing scene structure) and illumination (indicating lighting conditions); the proposed framework approximates this decomposition within the feature space. The architecture utilises deep multi-scale feature aggregation together with a reflectance-guided fusion pathway, enabling adaptive integration of illumination-aware representations through element-wise modulation. Despite being trained only on normal-light images, the framework generalises effectively to low-light and visibility-compromised environments. Comprehensive experiments on both synthetic datasets (Pascal VOC) and real-world benchmarks (ExDark, RTTS) show that the method achieves improved detection accuracy and robustness, particularly under adverse lighting, and outperforms current state-of-the-art techniques.
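To make the core idea concrete, the sketch below shows one possible way a Retinex-style split and reflectance-guided element-wise modulation could be realised in feature space. This is an illustrative assumption, not code from the submission: the module and branch names (`ReflectanceGuidedFusion`, `illum_branch`, `refl_branch`) are hypothetical, and it assumes a standard PyTorch backbone feature map of shape (N, C, H, W).

```python
import torch
import torch.nn as nn


class ReflectanceGuidedFusion(nn.Module):
    """Hypothetical sketch: approximate the Retinex decomposition
    (feature ~ reflectance * illumination) in feature space and fuse
    illumination-aware features via element-wise modulation."""

    def __init__(self, channels: int):
        super().__init__()
        # Illumination-like component, bounded to (0, 1) like a lighting map
        self.illum_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        # Reflectance-like component, intended to capture scene structure
        self.refl_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        illum = self.illum_branch(feat)   # lighting-dependent factor
        refl = self.refl_branch(feat)     # structure (reflectance) factor
        # Element-wise modulation plus a residual path back to the input feature
        return refl * illum + feat


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)        # dummy multi-scale backbone feature
    fused = ReflectanceGuidedFusion(64)(x)
    print(fused.shape)                    # torch.Size([1, 64, 32, 32])
```

In this reading, the sigmoid-bounded branch plays the role of an illumination map and the ReLU branch the reflectance, so the product mimics the Retinex relation I = R ⊙ L at the feature level; the actual architecture in the paper may differ in branch design, scales, and fusion details.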
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 11093