Keywords: Video Compression, Diffusion Transformer
TL;DR: We propose Flow-IB, a generative video compression framework that compresses videos 32,768× by combining information bottleneck and flow matching, outperforming existing codecs.
Abstract: We present a generative video compression framework that achieves an unprecedented 32,768$\times$ compression ratio by transmitting only the first and last frames as I-frames and reconstructing the remaining content with a flow-matching video diffusion model. Guided by the information bottleneck principle, our method introduces a differentiable loss that minimizes mutual information with the known I-frames, enabling joint optimization of compression and generation within a unified framework. This design allows the generative model to faithfully reconstruct intermediate frames at extreme compression rates. Extensive experiments demonstrate that our approach substantially outperforms both traditional codecs and recent deep learning–based schemes across standard rate–distortion metrics. Moreover, on multiple downstream tasks, the reconstructed videos perform comparably to state-of-the-art semantic communication methods, demonstrating the strong potential of generative compression as a practical alternative to conventional coding.
Supplementary Material: zip
Primary Area: generative models
Submission Number: 2885