Abstract: Existing multi-view image compression methods often rely on 2D projection-based similarities between views to estimate disparities. While effective for small disparities, such as those in stereo images, these methods struggle with the more complex disparities encountered in wide-baseline multi-camera systems, which are common in virtual reality and autonomous driving applications. To address this limitation, we propose 3D-LMVIC, a novel learning-based multi-view image compression framework that leverages 3D Gaussian Splatting to derive geometric priors for accurate disparity estimation. Furthermore, we introduce a depth map compression model to minimize geometric redundancy across views, along with a multi-view sequence ordering strategy, based on a distance measure defined between views, to enhance correlations between adjacent views. Experimental results demonstrate that 3D-LMVIC achieves superior performance compared to both traditional and learning-based methods. Additionally, it significantly improves disparity estimation accuracy over existing two-view approaches.
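The core idea of using a geometric prior for disparity estimation can be illustrated with a small sketch: given a depth map rendered from the reconstructed 3D Gaussians and the camera parameters of two views, each reference pixel can be back-projected to 3D and reprojected into the target view, yielding a per-pixel displacement. This is a minimal illustration under assumed conventions, not the paper's implementation; the function name and parameters (`warp_disparity`, `K_ref`, `T_ref_to_tgt`, etc.) are hypothetical.

```python
import numpy as np

def warp_disparity(depth_ref, K_ref, K_tgt, T_ref_to_tgt):
    """Sketch: derive per-pixel displacement between two views from a depth prior.

    depth_ref:     (H, W) depth of the reference view (e.g. rendered from 3D Gaussians)
    K_ref, K_tgt:  (3, 3) camera intrinsics of reference and target views
    T_ref_to_tgt:  (4, 4) rigid transform from reference to target camera frame
    Returns an (H, W, 2) array giving each reference pixel's displacement in the target view.
    """
    H, W = depth_ref.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))            # pixel grid
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)

    # Back-project reference pixels to 3D camera coordinates using the depth prior.
    cam_ref = (pix @ np.linalg.inv(K_ref).T) * depth_ref[..., None]

    # Move the 3D points into the target camera frame.
    cam_ref_h = np.concatenate([cam_ref, np.ones((H, W, 1))], axis=-1)
    cam_tgt = cam_ref_h @ T_ref_to_tgt.T

    # Project into the target image plane.
    proj = cam_tgt[..., :3] @ K_tgt.T
    uv_tgt = proj[..., :2] / np.clip(proj[..., 2:3], 1e-6, None)

    # Displacement between corresponding pixel locations across the two views.
    return uv_tgt - np.stack([u, v], axis=-1).astype(np.float64)
```

Such a displacement field can then serve as a geometry-guided alignment signal when predicting one view from another during compression.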
Lay Summary: Modern applications like virtual reality use many cameras to capture scenes from different angles, producing large amounts of image data. Compressing this data is challenging, especially when camera views are far apart. Our method, 3D-LMVIC, uses 3D scene understanding to better align and compress images from different views. It achieves higher compression efficiency and quality by estimating 3D geometry and intelligently reordering the views. This helps reduce storage and transmission costs in real-world 3D applications.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/YujunHuang063/3D-GP-LMVIC
Primary Area: Applications->Computer Vision
Keywords: Multi-View Image Compression; 3D Gaussian Splatting; Deep Learning
Submission Number: 6156