Abstract: Rolling shutter correction (RSC) is becoming increasingly popular for RS cameras, which are widely used in commercial and industrial applications. Despite their promising performance, existing RSC methods typically employ a two-stage network structure that ignores intrinsic information interactions and hinders fast inference. In this paper, we propose a single-stage encoder-decoder-based network, named JAMNet, for efficient RSC. It first extracts pyramid features from consecutive RS inputs, and then simultaneously refines two complementary kinds of information (i.e., global shutter appearance and undistortion motion field) in a joint learning decoder so that each promotes the other. To inject sufficient motion cues for guiding joint learning, we introduce a transformer-based motion embedding module and propose to pass hidden states across pyramid levels. Moreover, we present a new data augmentation strategy, “vertical flip + inverse order”, to unlock the potential of the RSC datasets. Experiments on various benchmarks show that our approach surpasses the state-of-the-art methods by a large margin, especially with a 4.7 dB PSNR leap on real-world RSC. Code is available at https://github.com/GitCVfb/JAMNet.
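The “vertical flip + inverse order” augmentation can be illustrated with a minimal NumPy sketch. This is not taken from the JAMNet code base: the function name `vflip_inverse_order`, the `(T, H, W, C)` array layout, and the choice to simply flip whatever ground-truth frame is supplied are all assumptions made for illustration. The abstract does not specify which global shutter frame should be paired with the reordered sequence. The presumed intuition is that a vertical flip alone reverses the apparent scanline direction, and reversing the frame order as well keeps the augmented sequence consistent with a top-to-bottom readout.

```python
import numpy as np


def vflip_inverse_order(rs_frames, gs_target):
    """Hypothetical sketch of a 'vertical flip + inverse order' augmentation.

    rs_frames: array of shape (T, H, W, C), consecutive rolling shutter frames.
    gs_target: array of shape (H, W, C), a global shutter ground-truth frame.
    Returns the augmented (rs_frames, gs_target) pair as new arrays.
    """
    # Reverse the temporal axis (inverse order) and the row axis (vertical flip).
    aug_frames = rs_frames[::-1, ::-1].copy()
    # Assumption: the supplied target is only flipped vertically; selecting a
    # different target time instance is left to the caller.
    aug_target = gs_target[::-1].copy()
    return aug_frames, aug_target
```

Applying the function twice returns the original pair, so it can be toggled with a random coin flip during training without any other bookkeeping.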