Keywords: camera response model, camera response function, brightness transformation function, dual-branch contrastive encoder
Abstract: Visual perception in the wild has demonstrated transformative potential across a wide range of applications, from planetary exploration to deep-sea monitoring missions. However, a fundamental challenge remains: enabling visual perception enhancement that can explicitly extract rules and support interactive, precise manipulation in unknown, dynamic environments, particularly under large-scale data absence, heterogeneous data distributions, and without the supervision of annotated images. We introduce a differentiable physics framework that unifies the camera response model (CRM) with deep learning to achieve visual perception enhancement under multiple degradation conditions. Specifically, grounded in fundamental principles of radiation physics, we formulate camera response function (CRF) calibration as a constrained optimization problem. We then reconstruct the brightness transformation function (BTF) of the traditional CRM as a multi-scale generative network, completely decoupling it from the CRF. Meanwhile, we design a dual-branch contrastive encoder that enables the BTF to regulate the irradiance enhancement process through multi-scale exposure distributions learned from guide images, offering a flexible BTF interface that supports stable, controllable domain generalization for image enhancement. Comprehensive experiments show that our method significantly advances domain generalization in adaptive image enhancement, outperforming specialized counterparts by an average margin of +1.226 UIQM across challenging unseen underwater domains.
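To make the CRM decomposition concrete, the sketch below shows the classical formulation the abstract builds on: an image is mapped back to irradiance via the inverse CRF, the irradiance is adjusted, and the CRF is reapplied. This is a minimal illustration, not the paper's method — it assumes a gamma-type CRF and collapses the BTF to a scalar exposure ratio `k`, whereas the paper calibrates the CRF via constrained optimization and replaces the exposure-ratio step with a learned multi-scale generative network.

```python
import numpy as np

# Hypothetical gamma-type CRF; real CRFs are calibrated per camera
# (the paper formulates this calibration as a constrained optimization).
GAMMA = 2.2

def crf(irradiance):
    """Map scene irradiance E in [0, 1] to pixel intensity I = f(E)."""
    return np.clip(irradiance, 0.0, 1.0) ** (1.0 / GAMMA)

def crf_inverse(intensity):
    """Recover irradiance E = f^{-1}(I) by inverting the assumed CRF."""
    return np.clip(intensity, 0.0, 1.0) ** GAMMA

def enhance(image, k):
    """Classical CRM enhancement: I' = f(k * f^{-1}(I)).

    Here the BTF degenerates to a scalar exposure ratio k; in the paper
    this step is a multi-scale generative network conditioned on exposure
    distributions learned from guide images.
    """
    return crf(k * crf_inverse(image))

dark = np.array([0.1, 0.3, 0.5])
bright = enhance(dark, k=4.0)  # brighten in the irradiance domain
```

Operating in the irradiance domain (rather than scaling pixel intensities directly) is what keeps the enhancement physically grounded, since the nonlinear CRF is factored out before exposure is adjusted.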
Supplementary Material: pdf
Primary Area: applications to robotics, autonomy, planning
Submission Number: 3333