360VFI: A Dataset and Benchmark for Omnidirectional Video Frame Interpolation

TMLR Paper3313 Authors

09 Sept 2024 (modified: 29 Sept 2024)Under review for TMLREveryoneRevisionsBibTeXCC BY 4.0
Abstract: Head-mounted 360° displays and portable 360° cameras have significantly progressed, providing viewers a realistic and immersive experience. However, many omnidirectional videos have low frame rates that can lead to visual fatigue, and the prevailing plane frame interpolation methodologies are unsuitable for omnidirectional video interpolation because they are designed solely for traditional videos. This paper introduces the benchmark dataset, 360VFI, for Omnidirectional Video Frame Interpolation. We present a practical implementation that introduces a distortion prior from omnidirectional video into the network to modulate distortions. Specifically, we propose a pyramid distortion-sensitive feature extractor that uses the unique characteristics of equirectangular projection (ERP) format as prior information. Moreover, we devise a decoder that uses an affine transformation to further facilitate the synthesis of intermediate frames. 360VFI is the first dataset and benchmark that explores the challenge of Omnidirectional Video Frame Interpolation. Through our benchmark analysis, we present four different distortion condition scenes in the proposed 360VFI dataset to evaluate the challenges triggered by distortion during interpolation. Besides, experimental results demonstrate that Omnidirectional Video Interpolation can be effectively improved by modeling for omnidirectional distortion.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Hongsheng_Li3
Submission Number: 3313
Loading