MHAVSR: A multi-layer hybrid alignment network for video super-resolution

Xintao Qiu; Yuanbo Zhou; Xinlin Zhang; Yuyang Xue; Xiaoyong Lin; Xinwei Dai; Hui Tang; Guoyang Liu; Rui Yang; Zhen Liu; Xiaojing Wei; Junxiu Yang; Tong Tong; Qinquan Gao

Published: 01 Jan 2025, Last Modified: 16 Apr 2025, Neurocomputing 2025, CC BY-SA 4.0

Abstract: Video super-resolution (VSR) aims to restore high-resolution (HR) frames from low-resolution (LR) frames. The key to this task is to fully exploit the complementary information between frames when reconstructing the high-resolution sequence. Existing methods address this with either a sliding-window strategy or a recurrent architecture that performs a single alignment; these respectively lack long-range modeling ability or are prone to frame-by-frame error accumulation. In this paper, we propose a Multi-layer Hybrid Alignment network for VSR (MHAVSR), which combines a sliding window with a recurrent structure and extends the number of propagation layers based on this hybrid structure. At each propagation layer, alignment is performed simultaneously on the bidirectional neighboring frames and on the hidden states from recurrent propagation, which improves alignment while fully utilizing both the short-term and long-term information in the video sequence. Next, we present a flow-enhanced dual-deformable alignment module, which uses optical flow to improve the accuracy of the deformable convolution offsets and fuses the separate alignment results of the hybrid alignment to reduce artifacts caused by alignment errors. In addition, we introduce a spatial–temporal reconstruction module to compensate for the model's representation capacity at different scales. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches. In particular, on the Vid4 test set, our model exceeds IconVSR by 0.82 dB in PSNR with a similar number of parameters. Code is available at https://github.com/fzuqxt/MHAVSR.
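
For readers unfamiliar with the metric behind the reported 0.82 dB gain, the following is a minimal sketch of the standard PSNR definition, not the authors' evaluation code. The function names `psnr` and `mse_ratio_for_gain` are hypothetical, and the 8-bit peak value of 255 is an assumption; the abstract does not state the evaluation protocol (color channel, border cropping, or how scores are averaged over frames).

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumes both inputs hold the same number of pixel values and that
    max_value is the peak intensity (255 for 8-bit images, an assumption).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain on a single image."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    hr = [100.0, 120.0, 140.0, 160.0]   # toy "ground truth" pixels
    sr = [110.0, 130.0, 150.0, 170.0]   # toy "restored" pixels, each off by 10
    print(f"PSNR: {psnr(hr, sr):.2f} dB")
    print(f"MSE ratio for +0.82 dB: {mse_ratio_for_gain(0.82):.4f}")
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.82 dB on a single image corresponds to the error shrinking by a factor of about 1.21. Scores averaged over many frames do not translate this directly.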
