Keywords: Video compression, Implicit neural representations
Abstract: Recent advances in implicit neural representation (INR)-based video coding have
demonstrated its potential to compete with both conventional and other learning-
based approaches. With INR methods, a neural network is trained to overfit a
video sequence, with its parameters compressed to obtain a compact representation
of the video content. However, although promising results have been achieved,
the best INR-based methods are still outperformed by the latest standard codecs,
such as VVC VTM, partially due to the simple model compression techniques
employed. In this paper, rather than focusing on representation architectures, as
many existing works do, we propose a novel INR-based video
compression framework, Neural Video Representation Compression (NVRC),
targeting compression of the representation. Based on its novel quantization and
entropy coding approaches, NVRC is the first framework capable of optimizing an
INR-based video representation in a fully end-to-end manner for the rate-distortion
trade-off. To further minimize the additional bitrate overhead introduced by the
entropy models, NVRC also compresses all the network, quantization and entropy
model parameters hierarchically. Our experiments show that NVRC outperforms
many conventional and learning-based benchmark codecs, with a 23% average
coding gain over VVC VTM (Random Access) on the UVG dataset, measured
in PSNR. As far as we are aware, this is the first time an INR-based video codec
has achieved such performance.
Primary Area: Machine vision
Submission Number: 4742