Abstract: Successful video deblurring relies on effectively using sharp pixels from other frames to recover the blurry pixels of the current frame. However, mainstream methods use only estimated optical flows to align and fuse features from adjacent frames, without considering pixel-wise blur levels, which introduces blurry pixels from adjacent frames. Furthermore, these methods fail to effectively exploit information from the entire input video. To address these limitations, we propose STDANet++, which redesigns the state-of-the-art method STDANet by introducing a patch-based spatio-temporal deformable attention (PSTDA) module and a long-term frame fusion (LTFF) module into a BiRNN-based structure. Experimental results show that, by effectively utilizing sharp information across the entire video, the proposed method outperforms state-of-the-art methods on the GoPro, DVD, and BSD datasets. The source code is available at https://github.com/huicongzhang/STDANetPP.
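To make the abstract's core idea concrete, the following is a minimal PyTorch sketch of blur-aware feature fusion: instead of fusing flow-aligned neighbor features directly, a learned per-pixel sharpness map down-weights blurry pixels before fusion. This is an illustrative assumption about the general technique, not code from the STDANetPP repository; the module name, `sharpness_head`, and the tensor shapes are all hypothetical.

```python
# Hypothetical sketch of blur-aware fusion, NOT the authors' implementation.
import torch
import torch.nn as nn


class SharpnessWeightedFusion(nn.Module):
    """Fuse current-frame features with aligned neighbor features,
    suppressing neighbor pixels predicted to be blurry."""

    def __init__(self, channels: int):
        super().__init__()
        # Predicts a per-pixel sharpness confidence map in [0, 1].
        self.sharpness_head = nn.Sequential(
            nn.Conv2d(channels, channels // 2, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, 1, 3, padding=1),
            nn.Sigmoid(),
        )
        # Merges current-frame features with the confidence-weighted neighbor features.
        self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, curr_feat: torch.Tensor, aligned_feat: torch.Tensor) -> torch.Tensor:
        # curr_feat, aligned_feat: (B, C, H, W). The neighbor features are
        # assumed to be already aligned (e.g., via optical flow or a
        # deformable-attention module, as described in the abstract).
        weight = self.sharpness_head(aligned_feat)   # (B, 1, H, W)
        weighted = aligned_feat * weight             # down-weight blurry pixels
        return self.fuse(torch.cat([curr_feat, weighted], dim=1))


if __name__ == "__main__":
    fusion = SharpnessWeightedFusion(channels=64)
    curr = torch.randn(1, 64, 32, 32)
    neighbor = torch.randn(1, 64, 32, 32)
    print(fusion(curr, neighbor).shape)  # torch.Size([1, 64, 32, 32])
```

In the paper's actual design, alignment and weighting are handled jointly by the PSTDA module, and the LTFF module extends the fusion beyond adjacent frames to the whole video; the sketch above isolates only the pixel-wise blur-level weighting idea.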