Instance-Adaptive Video Compression: Improving Neural Codecs by Training on the Test Set

Published: 27 Jun 2023, Last Modified: 17 Sept 2024. Accepted by TMLR.
Authors that are also TMLR Expert Reviewers: ~Taco_Cohen1
Abstract: We introduce a video compression algorithm based on instance-adaptive learning. On each video sequence to be transmitted, we finetune a pretrained compression model. The optimal parameters are transmitted to the receiver along with the latent code. By entropy-coding the parameter updates under a suitable mixture model prior, we ensure that the network parameters can be encoded efficiently. This instance-adaptive compression algorithm is agnostic about the choice of base model and has the potential to improve any neural video codec. On UVG, HEVC, and Xiph datasets, our codec improves the performance of a scale-space flow model by between 21% and 27% BD-rate savings, and that of a state-of-the-art B-frame model by 17 to 20% BD-rate savings. We also demonstrate that instance-adaptive finetuning improves the robustness to domain shift. Finally, our approach reduces the capacity requirements of compression models. We show that it enables a competitive performance even after reducing the network size by 70%.
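The abstract says that parameter updates are entropy-coded under a mixture model prior, but it does not give the form of that prior. As a minimal sketch, assume a two-component mixture of zero-mean Gaussians (a narrow "spike" and a wide "slab") over updates quantized on a uniform grid. The function names, the quantization step, and all numeric values below are hypothetical illustrations, not the authors' settings. The sketch shows why such a prior keeps the overhead small: unchanged parameters cost a fraction of a bit each, while the few large updates remain affordable.

```python
import math


def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def bin_probability(k, step, sigma):
    """Mass a zero-mean Gaussian assigns to the quantization bin centered at k * step."""
    lo = (k - 0.5) * step
    hi = (k + 0.5) * step
    return gaussian_cdf(hi, sigma) - gaussian_cdf(lo, sigma)


def update_bits(deltas, step=0.01, spike_sigma=0.005, slab_sigma=0.1, slab_weight=0.05):
    """Ideal code length in bits of quantized parameter updates under the assumed mixture prior.

    All default values are illustrative assumptions, not values from the paper.
    """
    total = 0.0
    for d in deltas:
        k = round(d / step)  # index of the quantization bin the update falls into
        p = ((1.0 - slab_weight) * bin_probability(k, step, spike_sigma)
             + slab_weight * bin_probability(k, step, slab_sigma))
        total += -math.log2(max(p, 1e-12))  # guard against numerical underflow
    return total


if __name__ == "__main__":
    print(update_bits([0.0] * 1000))   # cost of leaving 1000 parameters unchanged
    print(update_bits([0.05] * 1000))  # cost of changing all 1000 parameters
```

In a finetuning loop, a term like this would be added to the latent rate and the weighted distortion, so that the optimizer only changes parameters whose update pays for itself in rate-distortion terms.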
Certifications: Expert Certification
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
Changes 2023-06-23 (camera ready):
- De-anonymized the paper and included acknowledgments.
- Added FFNeRV (Lee 2022) to the related work section and to Figures 1 and 2.
- Added supplementary materials, which include sample videos and per-video results in a CSV file.
- Corrected a wrongly reported sparsity number in Appendix D.5.
- Added a reproducibility statement.
Changes 2023-05-08:
- Flipped the y-axes in Figure 2 for consistency with the rate-distortion plots _(as requested by reviewer 6RHK)_.
- Added a more detailed explanation of the relative rate-distortion plots _(as requested by reviewer 6RHK)_.
Changes 2023-05-03:
- Changed the introduction to put more emphasis on the motivation of our work _(as requested by reviewer DBHN)_.
- Extended the related work section on neural implicit compression methods _(as requested by reviewer DBHN)_.
- Added a paragraph to the related work section about model compression methods _(as requested by reviewer 6RHk)_.
- Made the contribution relative to van Rozendaal et al. (2021) more explicit _(as requested by reviewers DBHN, J84y)_.
- Fixed small spelling mistakes and improved writing quality _(as requested by reviewers 6RHk, DBHN)_.
- Placed figures in the text closer to the corresponding paragraphs _(as requested by reviewer 6RHk)_.
- Added a more detailed explanation of the interpolated curves in Figure 2 _(as requested by reviewer 6RHk)_.
- Specified the hardware used for the encoding time benchmark in Figure 5 _(as requested by reviewer J84y)_.
Supplementary Material: zip
Assigned Action Editor: ~Jia-Bin_Huang1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 852