NVRC: Neural Video Representation Compression

Ho Man Kwan; Ge Gao; Fan Zhang; Andrew Peter Gower; David Bull

Published: 25 Sept 2024, Last Modified: 06 Nov 2024, NeurIPS 2024 poster, CC BY 4.0

Keywords: Video compression, Implicit neural representations

Abstract: Recent advances in implicit neural representation (INR)-based video coding have demonstrated its potential to compete with both conventional and other learning-based approaches. With INR methods, a neural network is trained to overfit a video sequence, and its parameters are then compressed to obtain a compact representation of the video content. However, although promising results have been achieved, the best INR-based methods are still outperformed by the latest standard codecs, such as VVC VTM, partly due to the simple model compression techniques employed. In this paper, rather than focusing on representation architectures, as many existing works do, we propose a novel INR-based video compression framework, Neural Video Representation Compression (NVRC), which targets compression of the representation itself. Based on its novel quantization and entropy coding approaches, NVRC is the first framework capable of optimizing an INR-based video representation in a fully end-to-end manner for the rate-distortion trade-off. To further minimize the additional bitrate overhead introduced by the entropy models, NVRC also compresses all the network, quantization and entropy model parameters hierarchically. Our experiments show that NVRC outperforms many conventional and learning-based benchmark codecs, with a 23% average coding gain over VVC VTM (Random Access) on the UVG dataset, measured in PSNR. As far as we are aware, this is the first time an INR-based video codec has achieved such performance.
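To make the rate-distortion trade-off mentioned in the abstract concrete, the following is a minimal sketch of how quantized network parameters might be scored under an entropy model and combined with a distortion term. It is not the NVRC implementation: the uniform quantizer, the discretized Gaussian entropy model, the loss form `D + lam * R`, and all function names here are assumptions chosen purely for illustration.

```python
import math

def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def quantize(params, step):
    """Uniform scalar quantization (assumed): map each parameter to an integer index."""
    return [round(p / step) for p in params]

def estimated_bits(indices, mean, scale):
    """Estimated code length of the indices under a discretized Gaussian (assumed model)."""
    bits = 0.0
    for q in indices:
        # Probability mass of the unit-width bin centred on the integer index.
        prob = gaussian_cdf(q + 0.5, mean, scale) - gaussian_cdf(q - 0.5, mean, scale)
        bits += -math.log2(max(prob, 1e-12))
    return bits

def psnr(mse_value, peak=1.0):
    """PSNR in dB for a signal whose maximum value is `peak`."""
    return 10.0 * math.log10(peak ** 2 / mse_value)

def rd_loss(distortion, bits, num_pixels, lam):
    """Rate-distortion objective (assumed form): distortion + lam * bits per pixel."""
    return distortion + lam * bits / num_pixels
```

In an end-to-end setting, the rounding step would typically be replaced by a differentiable proxy during training so that gradients can reach both the network parameters and the entropy model; the sketch above only evaluates the objective and does not attempt to train anything.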

Primary Area: Machine vision

Submission Number: 4742
