Abstract: Optical flow estimation is a fundamental task in computer vision. Because panoramic cameras have an ultra-wide field of view (FoV), traditional perspective-based optical flow methods do not adapt to the omnidirectional nature of 360° panoramic images, which makes optical flow estimation for panoramic images challenging. In this paper, we first transform each panoramic image into a set of distortion-free tangent images that cover the entire FoV and extract tangent-image features with a CNN, which avoids the severe distortion of the equirectangular projection. Then, we introduce a stereo embedding module that adds stereoscopic features to the tangent images to make them globally consistent. Finally, we globally aggregate the distortion-free encoder features with a transformer, which enhances the image features and helps handle large pixel displacements. Extensive experimental results demonstrate that our method achieves state-of-the-art performance on the public dataset FlowScape and exhibits strong generalization capability.
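As an illustration of the first step, the following is a minimal sketch of how a single tangent image could be sampled from an equirectangular panorama with the standard inverse gnomonic projection. It is not the authors' implementation: the function names, the 60° patch FoV, the patch size, and the nearest-neighbour sampling are all assumptions made for this example, and the abstract does not specify how tangent-plane centres are chosen.

```python
import numpy as np

def tangent_grid_to_lonlat(lon0, lat0, fov, size):
    """Inverse gnomonic projection: map a size x size tangent-plane grid
    centred at (lon0, lat0) to spherical longitude/latitude in radians.
    Hypothetical helper, not taken from the paper."""
    half = np.tan(fov / 2.0)                 # half-extent of the tangent plane
    xs = np.linspace(-half, half, size)      # x grows eastward
    ys = np.linspace(half, -half, size)      # first row is the northern edge
    x, y = np.meshgrid(xs, ys)
    rho = np.hypot(x, y)
    c = np.arctan(rho)                       # angular distance from the centre
    safe_rho = np.where(rho == 0, 1.0, rho)  # avoid 0/0 at the tangent point
    sin_lat = np.cos(c) * np.sin(lat0) + y * np.sin(c) * np.cos(lat0) / safe_rho
    lat = np.arcsin(np.clip(sin_lat, -1.0, 1.0))
    lon = lon0 + np.arctan2(
        x * np.sin(c),
        rho * np.cos(lat0) * np.cos(c) - y * np.sin(lat0) * np.sin(c),
    )
    lon = (lon + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi)
    return lon, lat

def sample_tangent_image(erp, lon0, lat0, fov=np.pi / 3, size=64):
    """Sample one tangent image from an equirectangular image `erp`
    (H x W or H x W x C) with nearest-neighbour lookup (an assumption;
    bilinear interpolation would be the usual choice in practice)."""
    h, w = erp.shape[:2]
    lon, lat = tangent_grid_to_lonlat(lon0, lat0, fov, size)
    u = ((lon + np.pi) / (2 * np.pi) * w).astype(int) % w          # column index
    v = np.clip(((np.pi / 2 - lat) / np.pi * h).astype(int), 0, h - 1)  # row index
    return erp[v, u]
```

Calling `sample_tangent_image` for a set of centres spread over the sphere would yield the set of tangent images that covers the full FoV; each patch can then be passed to an ordinary perspective CNN encoder.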