NHVC: Neural Holographic Video Compression with Scalable Architecture

Hyunmin Ban, Seungmi Choi, Jun Yeong Cha, Yeongwoong Kim, Hui Yong Kim

Published: 2024, Last Modified: 12 Apr 2025, VR 2024, CC BY-SA 4.0

Abstract: Recently, neural network-based approaches for hologram generation and compression have gained popularity because they allow efficient inference on GPUs without the iterative optimization required by traditional methods. In this paper, we introduce Neural Holographic Video Compression (NHVC), an end-to-end trainable and scalable model designed for high-quality phase hologram video generation and compression. NHVC consists of an auto-encoder-based phase hologram generator, a latent coder, and two hyper-prior coders. For each input image, the latent features are extracted by the encoder part of the phase generator and then entropy coded by the shared latent coder based on the hyper-prior information. The two hyper-prior coders employ a spatial and a spatio-temporal entropy model for I-frames and P-frames, respectively. With this architecture, NHVC offers task-scalability, allowing a single trained model to serve as a phase hologram generator, a phase hologram image compressor, or a phase hologram video compressor as required. Experimental results on phase hologram video compression with the UVG dataset show that our model outperforms 'HoloNet + VVC' by a 75.6% BD-Rate reduction, with modest 2K encoding and decoding speeds (5 fps and 12 fps, respectively). For the phase hologram video generation task, our model achieves much higher-quality reconstruction on the UVG dataset (almost 42 dB PSNR), while the previous neural generation model HoloNet provides at most 36 dB. We also provide an extensive experimental study of several important design questions: the need for quadruple extension (QE) in the neural compression model, the feasibility of motion estimation in the phase domain and an alternative to it, the need for a larger receptive field to learn better phase features, and variable-rate support with a single trained model. It is noteworthy that our model is the first and best neural phase video compression model providing such high-quality reconstruction and task-scalability.
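To put the reported reconstruction figures (almost 42 dB versus at most 36 dB PSNR) in perspective, the following is a minimal sketch of the standard PSNR definition and of what a gap in dB implies for mean squared error. The function names `psnr` and `mse_ratio` and the peak value of 1.0 are assumptions made for illustration; the abstract does not state the exact evaluation protocol.

```python
import math


def psnr(reference, reconstruction, peak=1.0):
    """PSNR in dB between two equally sized sequences of pixel values.

    `peak` is the maximum possible pixel value (assumed 1.0 here;
    it would be 255 for 8-bit images).
    """
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio(psnr_high_db, psnr_low_db):
    """Factor by which the MSE shrinks going from the lower to the higher PSNR,
    assuming both are measured with the same peak value."""
    return 10.0 ** ((psnr_high_db - psnr_low_db) / 10.0)


if __name__ == "__main__":
    # A 6 dB gap (42 dB vs 36 dB) corresponds to roughly 4x lower MSE.
    print(round(mse_ratio(42.0, 36.0), 2))
```

Under this definition, a 6 dB improvement means the mean squared reconstruction error is about four times smaller, provided both figures use the same peak value and test data.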