Enhancing Vision: Harmonizing Frequency for Imaging Quality and Perception Accuracy

Published: 2025, Last Modified: 07 Jan 2026, Venue: ICASSP 2025, License: CC BY-SA 4.0
Abstract: In low-level vision tasks, achieving harmony between visual quality and recognition accuracy is often challenging, as the two do not always align. Many existing approaches focus on optimizing downstream tasks by linking image quality to machine perception, typically incurring additional burdens such as extensive annotations and joint training. In this work, we demonstrate that independent low-level reconstruction algorithms can simultaneously enhance imaging quality and downstream perception accuracy. By conducting a comprehensive frequency-domain analysis, we identify high-frequency components as critical for both visual and perceptual tasks. To counteract the frequency information loss typically seen in ISP pipelines, we propose a Multi-Frequency Fusion Block (MFFB) for on-the-fly upsampling, alongside a Frequency-Aware Supervision (FAS) mechanism guided by discrete wavelet transform. Our method achieves a notable +0.32 dB improvement in smart ISP performance on the Zurich dataset. Moreover, without relying on assistance from downstream tasks, our approach demonstrates significant improvements in object detection and instance segmentation.
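To make the idea of wavelet-guided supervision concrete, the sketch below shows one way a reconstruction loss could be computed on discrete wavelet subbands with extra weight on the high-frequency ones. It is an illustration only, not the paper's FAS mechanism, whose details the abstract does not give. The single-level Haar wavelet, the L1 distance, the function names `haar_dwt2` and `frequency_aware_l1`, and the `hf_weight` parameter are all assumptions made for this example.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level 2D Haar DWT of an (H, W) array with even H and W.

    Returns four subbands (LL, LH, HL, HH), each of shape (H/2, W/2).
    Subband naming conventions vary between libraries; here LH holds
    differences between rows and HL holds differences between columns.
    """
    img = np.asarray(img, dtype=np.float64)
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # top row minus bottom row
    hl = (a - b + c - d) / 2.0  # left column minus right column
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh


def frequency_aware_l1(pred, target, hf_weight=2.0):
    """L1 loss over Haar subbands, up-weighting the three detail bands.

    hf_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    p = haar_dwt2(pred)
    t = haar_dwt2(target)
    low = np.mean(np.abs(p[0] - t[0]))
    high = sum(np.mean(np.abs(ps - ts)) for ps, ts in zip(p[1:], t[1:]))
    return low + hf_weight * high


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((8, 8))
    blurred = np.full_like(target, target.mean())  # all detail removed
    print("loss(target, target): ", frequency_aware_l1(target, target))
    print("loss(blurred, target):", frequency_aware_l1(blurred, target))
```

With the 1/2 scaling used here the transform is orthonormal, so the subbands together carry the same energy as the input; raising `hf_weight` simply shifts the penalty toward errors in edges and texture.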