TSARN: A Joint Temporal-Spatial-Angular Reconstruction Network for Light Field Lenslet Video Compression
Abstract: Light Field (LF) lenslet videos capture the aggregation of light rays in dynamic scenes, providing a more immersive viewing experience. However, the immense data volume of LF lenslet videos poses significant challenges for storage and transmission, and existing methods that use traditional video codecs to encode all views fail to meet the demand for efficient compression. This paper proposes a compression framework based on down-sampling, focusing on an efficient reconstruction method at the decoder side. In LF lenslet videos, pixels that are occluded in the current view may be visible in angularly adjacent views and/or in temporally adjacent frames. We therefore propose a joint temporal-spatial-angular reconstruction network. In this network, the Spatial-Angular Convolutional Module uses different forms of convolution to fully extract spatial-angular features for an initial synthesis of LF views; the Deformable Convolutional Temporal Fusion Module employs deformable convolutions to align information across frames and aggregate temporal features; and the View Refinement Module further refines the features to reconstruct high-quality LF lenslet videos. Experimental results demonstrate that the proposed LF lenslet video compression framework achieves superior performance, significantly reducing the bitrate.
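To make the down-sampling-based framework concrete, the sketch below mimics its two endpoints with naive stand-ins: average pooling for the encoder-side down-sampling and nearest-neighbour up-sampling in place of the learned temporal-spatial-angular reconstruction network. All function names and the PSNR check are illustrative assumptions, not part of the paper's method.

```python
import numpy as np

def downsample(frame, s=2):
    # Average-pool by factor s (stand-in for encoder-side down-sampling).
    h, w = frame.shape
    return frame[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample_nearest(frame, s=2):
    # Nearest-neighbour up-sampling (naive stand-in for the learned
    # reconstruction network at the decoder side).
    return frame.repeat(s, axis=0).repeat(s, axis=1)

def psnr(a, b, peak=1.0):
    # Peak signal-to-noise ratio, the usual reconstruction-quality metric.
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
frame = rng.random((64, 64))          # one luma frame of a lenslet video
rec = upsample_nearest(downsample(frame))
print(rec.shape, round(psnr(frame, rec), 2))
```

A learned reconstruction replaces the nearest-neighbour step precisely because it can exploit the angular and temporal redundancy the abstract describes, rather than interpolating each frame in isolation.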