Real-time video super resolution network using recurrent multi-branch dilated convolutions

Yubin Zeng, Zhijiao Xiao, Kwok-Wai Hung, Simon Lui

Signal Process. Image Commun. 2021 (modified: 25 Feb 2025)

Highlights:
• A new multi-branch dilated module that effectively enlarges the network's receptive field.
• Extracts spatial-temporal features at different scales in parallel.
• Achieves superior performance at minimal computational cost.
• A new recurrent architecture for processing a consecutive multi-frame sequence.
• Extracts temporal and spatial features simultaneously for super-resolution.
• The proposed network can reconstruct high-definition video clips at up to 50 fps.

Abstract: Recent video super-resolution methods often exploit spatial and temporal context from the input frame sequence through explicit motion estimation, e.g., optical flow, which may introduce accumulated errors and requires heavy computation to obtain an accurate estimate. In this paper, we propose a novel multi-branch dilated convolution module for real-time frame alignment without explicit motion estimation, and combine it with a depthwise separable up-sampling module to form a real-time video super-resolution network. Specifically, the proposed framework efficiently acquires a larger receptive field and learns spatial–temporal features at multiple scales with minimal computational operations and memory requirements. Extensive experiments show that the proposed network outperforms current state-of-the-art real-time video super-resolution networks, e.g., VESPCN and 3DVSRnet, by 0.49 dB and 0.17 dB in PSNR, respectively, on average across various datasets, while requiring fewer multiplication operations.
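The two efficiency claims above (a larger receptive field from dilated branches, and fewer multiplications from depthwise separable layers) follow from standard convolution arithmetic, which the short sketch below illustrates. The dilation rates, kernel size, and channel counts are illustrative assumptions, not values taken from the paper, and the functions are hypothetical helpers rather than the authors' implementation.

```python
# Minimal sketch of standard convolution arithmetic (not the authors' code).
# All concrete numbers below (kernel size, dilations, channels) are assumptions.

def effective_kernel(k: int, dilation: int) -> int:
    """Spatial extent covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def conv_mults(k: int, c_in: int, c_out: int) -> int:
    """Multiplications per output pixel for a standard k x k convolution."""
    return k * k * c_in * c_out


def separable_mults(k: int, c_in: int, c_out: int) -> int:
    """Multiplications per output pixel for a depthwise k x k convolution
    followed by a pointwise 1 x 1 convolution."""
    return k * k * c_in + c_in * c_out


# Hypothetical parallel branches: same 3 x 3 kernel, different dilation rates.
BRANCH_DILATIONS = (1, 2, 4)
branch_extents = [effective_kernel(3, d) for d in BRANCH_DILATIONS]

# Dilation widens the covered area without adding weights:
# every branch still has 3 * 3 weights per input/output channel pair.
print("extent per branch:", branch_extents)

# Hypothetical 64 -> 64 channel layer: standard vs. depthwise separable cost.
standard = conv_mults(3, 64, 64)
separable = separable_mults(3, 64, 64)
print("standard:", standard, "separable:", separable)
```

Under these assumed settings, the widest branch spans a 9 x 9 area with the weights of a 3 x 3 kernel, and the separable layer needs roughly one eighth of the multiplications of the standard one.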
