Abstract: Effectively exploiting inter-frame information is critical for video denoising. Existing methods often rely on complex components, such as optical flow estimation and cross-frame self-attention, which incur high computational cost and limit their practicality in real-world scenarios. To address this limitation, we propose a simple yet efficient deep Frequency-Separable Temporal Network (FSTN) for video denoising. FSTN exploits the multi-scale analysis capability of the wavelet transform to separate features into high-frequency and low-frequency components, enabling faster processing while maintaining high-quality reconstruction. To further reduce computational complexity and improve detail preservation, we develop a learnable high-frequency processing module that adaptively filters noise and recovers edge details. Additionally, to effectively utilize information from long-range frames, we propose a low-frequency propagation scheme equipped with a temporal feature alignment module, which efficiently transfers structural information from distant frames, ensuring temporal consistency and improving denoising performance. Extensive experiments demonstrate that our method uses 1.28× fewer network parameters than state-of-the-art efficient video denoising methods, such as BasicVSR++, and requires less computation while achieving comparable performance.
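To make the frequency-separation idea concrete, the sketch below shows one plausible way to split feature maps into low- and high-frequency bands with a single-level Haar wavelet transform and to filter the high-frequency bands with a small learnable module. This is not the authors' FSTN implementation; the Haar choice, module names, and channel sizes are all assumptions made for illustration.

```python
# Minimal sketch of feature-level frequency separation, assuming a Haar
# wavelet. NOT the authors' FSTN code; `HighFreqFilter` is hypothetical.
import torch
import torch.nn as nn


def haar_dwt(x: torch.Tensor):
    """Single-level orthonormal Haar transform on (B, C, H, W) features.

    Returns the low-frequency band LL and the concatenated high-frequency
    bands (LH, HL, HH), each at half the spatial resolution.
    """
    x00 = x[:, :, 0::2, 0::2]  # even rows, even cols
    x01 = x[:, :, 0::2, 1::2]  # even rows, odd cols
    x10 = x[:, :, 1::2, 0::2]  # odd rows,  even cols
    x11 = x[:, :, 1::2, 1::2]  # odd rows,  odd cols
    ll = (x00 + x01 + x10 + x11) / 2  # coarse structure (low frequency)
    lh = (x10 + x11 - x00 - x01) / 2  # row differences: horizontal edges
    hl = (x01 + x11 - x00 - x10) / 2  # column differences: vertical edges
    hh = (x00 + x11 - x01 - x10) / 2  # diagonal detail
    return ll, torch.cat([lh, hl, hh], dim=1)


class HighFreqFilter(nn.Module):
    """Hypothetical learnable high-frequency module: a lightweight gated
    convolution that suppresses noise while preserving edge responses."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(3 * channels, 3 * channels, 3, padding=1)
        self.gate = nn.Sequential(
            nn.Conv2d(3 * channels, 3 * channels, 1), nn.Sigmoid()
        )

    def forward(self, high: torch.Tensor) -> torch.Tensor:
        return self.conv(high) * self.gate(high)  # adaptive soft filtering


if __name__ == "__main__":
    feats = torch.randn(1, 32, 64, 64)  # features of one frame
    low, high = haar_dwt(feats)
    denoised_high = HighFreqFilter(32)(high)
    # In a design like the abstract's, `low` would be propagated across
    # frames (with temporal alignment) while the filtered high band
    # restores local detail.
    print(low.shape, denoised_high.shape)  # (1,32,32,32), (1,96,32,32)
```

Because the transform halves spatial resolution, subsequent temporal processing of the low-frequency band operates on a quarter of the pixels, which is one plausible source of the efficiency gains the abstract reports.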