Abstract: Stereo image Super-Resolution (SR) has made significant progress as binocular systems have become widely adopted in recent years. Most stereo SR methods focus on improving PSNR, but their outputs are over-smoothed and lack detail. Perceptual-oriented SR methods are mainly designed for single-view images, so their performance degrades on stereo SR due to stereo inconsistency. We propose a perceptual-oriented stereo SR framework, denoted SC-NAFSSR, that considers both single-view and cross-view information. With NAF-SSR [3] as our backbone, we combine an LPIPS-based perceptual loss and a VGG-based perceptual loss for perceptual training. To improve stereo consistency, we supervise each Stereo Cross-Attention Module (SCAM) with a stereo consistency loss [27], which computes a photometric loss, a smoothness loss, and a cycle loss from the cycle-attention maps and valid masks of SCAM. Furthermore, we propose training strategies that fully exploit the model's potential for perceptual-oriented stereo SR. Extensive experiments and ablation studies demonstrate the effectiveness of the proposed method. In particular, SC-NAFSSR outperforms state-of-the-art methods on the Flickr1024 dataset [30]. In the NTIRE 2023 Stereo Image Super-Resolution Challenge Track 2 Perceptual & Bicubic [26], SC-NAFSSR ranked 2nd on the leaderboard. Our source code is available at https://github.com/FVL2020/SC-NAFSSR.
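
To make the three terms of the stereo consistency loss concrete, here is a minimal NumPy sketch. It is not taken from the SC-NAFSSR code: the abstract does not give the exact formulas, so the sketch assumes a common parallax-attention convention in which each attention map has shape (H, W, W) and each row is a distribution over positions on the same scanline of the other view. All function and argument names are hypothetical.

```python
import numpy as np

def warp(att, img):
    # att: (H, W, W) attention along each scanline; img: (H, W, C).
    # Each output pixel is an attention-weighted mix of the other view's row.
    return np.einsum("hij,hjc->hic", att, img)

def photometric_loss(left, right, att_r2l, mask_l):
    # Masked L1 distance between the left view and the right view warped to it.
    # mask_l: (H, W), 1 where a left pixel is assumed to have a valid match.
    diff = np.abs(left - warp(att_r2l, right)) * mask_l[..., None]
    return diff.sum() / max(mask_l.sum() * left.shape[-1], 1)

def cycle_loss(att_r2l, att_l2r, mask_l):
    # Going left -> right -> left should return to the start, so the
    # composed attention is pushed toward the identity on valid pixels.
    cyc = np.einsum("hij,hjk->hik", att_r2l, att_l2r)
    eye = np.eye(cyc.shape[-1])[None]
    diff = np.abs(cyc - eye) * mask_l[..., None]
    return diff.sum() / max(mask_l.sum() * cyc.shape[-1], 1)

def smoothness_loss(att):
    # Neighbouring pixels should attend to neighbouring positions:
    # compare vertically adjacent rows, and horizontally adjacent pixels
    # with their targets shifted by one.
    vert = np.abs(att[:-1] - att[1:]).mean()
    horiz = np.abs(att[:, :-1, :-1] - att[:, 1:, 1:]).mean()
    return vert + horiz

# Toy check: identical views with identity attention give zero loss.
H, W, C = 4, 6, 3
rng = np.random.default_rng(0)
left = rng.random((H, W, C))
att_id = np.broadcast_to(np.eye(W), (H, W, W)).copy()
mask = np.ones((H, W))
total = (photometric_loss(left, left, att_id, mask)
         + cycle_loss(att_id, att_id, mask)
         + smoothness_loss(att_id))
```

In a training setup along the lines the abstract describes, a sum like this would be computed for every SCAM and added to the perceptual losses with some weighting; the weights and the way the valid masks are derived are not stated in the abstract and are left out here.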