Rate-aware Compression for NeRF-based Volumetric Video

Zhiyu Zhang; Guo Lu; Huanxiong Liang; Zhengxue Cheng; Anni Tang; Li Song

Published: 20 Jul 2024, Last Modified: 31 Jul 2024 | MM2024 Oral | CC BY 4.0

Abstract: Neural radiance fields (NeRF) have advanced the development of 3D volumetric video technology, but the large data volumes they involve pose significant challenges for storage and transmission. To address these problems, existing solutions typically compress NeRF representations after the training stage, which separates representation training from compression. In this paper, we directly learn a compact NeRF representation for volumetric video during training, based on the proposed rate-aware compression framework. Specifically, for volumetric video, we use a simple yet effective modeling strategy to reduce temporal redundancy in the NeRF representation. Then, during the training phase, an implicit entropy model is used to estimate the bitrate of the NeRF representation. This entropy model is then encoded into the bitstream to assist in decoding the NeRF representation. This approach enables precise bitrate estimation and thereby leads to a compact NeRF representation. Furthermore, we propose an adaptive quantization strategy and learn the optimal quantization step for the NeRF representation. Finally, the NeRF representation is optimized with respect to the rate-distortion trade-off. Our compression framework can be applied to different representations, and experimental results demonstrate that our approach significantly reduces storage size with marginal distortion and achieves state-of-the-art rate-distortion performance for volumetric video on the HumanRF and ReRF datasets. Compared to the previous state-of-the-art method TeTriRF, we achieve a BD-rate of approximately -80% on the HumanRF dataset and -60% on the ReRF dataset.
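The rate-distortion trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the discretized Gaussian below is only a stand-in for the paper's implicit entropy model, the quantization step is fixed rather than learned, and the names `quantize`, `estimate_bits`, `rd_loss`, and the weight `lam` are hypothetical. The code only evaluates the objective D + lam * R for a given step; it does not perform the gradient-based training the paper describes.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian; an assumed stand-in for a learned entropy model."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def quantize(values, step):
    """Uniform scalar quantization: map each value to an integer symbol."""
    return [round(v / step) for v in values]


def estimate_bits(symbols, step, mean, scale):
    """Estimated bits: sum of -log2 of the probability mass of each bin."""
    bits = 0.0
    for q in symbols:
        center = q * step
        upper = gaussian_cdf(center + step / 2.0, mean, scale)
        lower = gaussian_cdf(center - step / 2.0, mean, scale)
        # Clamp to avoid log2(0) for symbols far in the tails.
        bits += -math.log2(max(upper - lower, 1e-12))
    return bits


def rd_loss(values, step, mean, scale, lam):
    """Return (loss, distortion, rate) with loss = distortion + lam * rate."""
    symbols = quantize(values, step)
    recon = [q * step for q in symbols]  # dequantized values
    distortion = sum((v - r) ** 2 for v, r in zip(values, recon)) / len(values)
    rate = estimate_bits(symbols, step, mean, scale)
    return distortion + lam * rate, distortion, rate


# Hypothetical feature values standing in for NeRF representation parameters.
features = [0.13, -0.42, 0.88, 0.05, -0.27]
_, dist_fine, rate_fine = rd_loss(features, step=0.1, mean=0.0, scale=0.5, lam=0.01)
_, dist_coarse, rate_coarse = rd_loss(features, step=0.5, mean=0.0, scale=0.5, lam=0.01)
print(f"step=0.1: distortion={dist_fine:.5f}, bits={rate_fine:.2f}")
print(f"step=0.5: distortion={dist_coarse:.5f}, bits={rate_coarse:.2f}")
```

In this toy setting a coarser step lowers the estimated bitrate and raises the distortion, which is the tension that a learned quantization step would have to balance.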

Primary Subject Area: [Content] Media Interpretation

Secondary Subject Area: [Systems] Transport and Delivery, [Experience] Multimedia Applications

Relevance To Conference: With the advancement of AR/VR technology, immersive 3D volumetric video plays an increasingly important role in the multimedia domain. The advent of Neural Radiance Fields (NeRF) has propelled the development of immersive video, yet the substantial data size of NeRF representations poses a challenge for transmission. To address the large data volumes associated with dynamic NeRF-based volumetric video, we introduce a compression algorithm that significantly reduces storage requirements. This improves the efficiency of compression and decompression, which is vital for handling 3D multimedia content. High-performance compression facilitates transmission, which in turn supports the application and development of immersive multimedia and broadens its range of uses in virtual reality, augmented reality, and other multimedia platforms.

Supplementary Material: zip

Submission Number: 2022
