# 3D-GP-LMVIC: Learning-based Multi-View Image Coding with 3D Gaussian Geometric Priors

Abstract: *Multi-view image compression is vital for 3D-related applications. To effectively model correlations between views, existing methods typically predict disparity between two views on a 2D plane, which works well for small disparities, such as in stereo images, but struggles with larger disparities caused by significant view changes. To address this, we propose a novel approach: learning-based multi-view image coding with 3D Gaussian geometric priors (3D-GP-LMVIC). Our method leverages 3D Gaussian Splatting to derive geometric priors of the 3D scene, enabling more accurate disparity estimation across views within the compression model. Additionally, we introduce a depth map compression model to reduce redundancy in geometric information between views. A multi-view sequence ordering method is also proposed to enhance correlations between adjacent views. Experimental results demonstrate that 3D-GP-LMVIC surpasses both traditional and learning-based methods in performance, while maintaining fast encoding and decoding speed.*

## Setup
To install the required dependencies, run:

```shell
pip install -r requirements.txt
```

Additionally, please install [`diff-gaussian-rasterization-w-depth`](https://github.com/JonathonLuiten/diff-gaussian-rasterization-w-depth) and [`simple-knn`](https://github.com/dreamgaussian/dreamgaussian). We recommend installing them in an environment with CUDA 11.x and GCC 9.4.0, as higher versions of GCC may lead to installation issues.

## Data Preparation
Using the Auditorium scene from the Tanks&Temples dataset as an example, the expected data structure is:

```
Tanks&Temples
|---Auditorium
    |---images
        |---<image 0>
        |---<image 1>
        |---...
    |---sparse
        |---0
            |---cameras.bin
            |---images.bin
            |---points3D.bin
    |---scene_params
        |---point_cloud
            |---iteration_30000
                |---point_cloud.ply
|---...
```

The `sparse` folder is generated by COLMAP; instructions for running COLMAP are available in the [3D Gaussian Splatting repository](https://github.com/graphdeco-inria/gaussian-splatting). After placing the Tanks&Temples dataset in the `gaussian_splatting/data` directory, run `gaussian_splatting/prepare_Tanks&Temples.sh` to generate the `scene_params` folder. Because the script name contains `&`, quote the path when invoking it from a shell.
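Before running the preparation script, it can help to confirm that a scene folder matches the expected layout. The following is a minimal sketch, not part of this repository: the helper name `missing_scene_paths` is hypothetical, and it assumes the usual COLMAP convention in which the three `.bin` files sit inside `sparse/0`.

```python
from pathlib import Path

# Paths relative to a scene folder, following the layout described above.
# Assumption: the COLMAP .bin files live inside sparse/0.
COLMAP_FILES = [
    "sparse/0/cameras.bin",
    "sparse/0/images.bin",
    "sparse/0/points3D.bin",
]
# Generated later by the preparation script, so it is checked only on request.
SCENE_PARAMS_FILE = "scene_params/point_cloud/iteration_30000/point_cloud.ply"


def missing_scene_paths(scene_dir, require_scene_params=False):
    """Return the expected paths that are absent from scene_dir (hypothetical helper)."""
    scene = Path(scene_dir)
    missing = []
    images = scene / "images"
    # The images folder must exist and contain at least one file.
    if not images.is_dir() or not any(p.is_file() for p in images.iterdir()):
        missing.append("images/<image files>")
    expected = list(COLMAP_FILES)
    if require_scene_params:
        expected.append(SCENE_PARAMS_FILE)
    missing.extend(rel for rel in expected if not (scene / rel).is_file())
    return missing
```

For example, `missing_scene_paths("gaussian_splatting/data/Tanks&Temples/Auditorium")` returns an empty list when the images and COLMAP output are in place; passing `require_scene_params=True` additionally checks for the `point_cloud.ply` file that the preparation script is expected to produce.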

## Training and Evaluation
Training and evaluation commands are provided in `train.sh` and `eval.sh`, respectively. When evaluating with entropy coding, be sure to pass the `--dep_decoder_last_layer_double` parameter, which makes the last layer of the depth map compression model's decoder run in double precision. Without it, numerical errors may cause entropy decoding to fail.
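To give some intuition for why precision matters here, the sketch below shows one common source of such numerical errors: in single precision, the result of a sum can depend on the order of accumulation, so two runs that should agree may round to different integers. This is an illustrative assumption about the failure mode, not a description of this repository's code; the helpers `to_f32`, `sum_f32`, and `sum_f64` are hypothetical and use only the Python standard library.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_f32(values):
    """Accumulate left to right, rounding to float32 after every addition."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


def sum_f64(values):
    """Accumulate left to right in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


terms = [1e8, -1e8, 1.0]      # same three terms ...
reordered = [1e8, 1.0, -1e8]  # ... accumulated in a different order

# Single precision: the two orders round to different integers.
print(round(sum_f32(terms)), round(sum_f32(reordered)))  # prints "1 0"
# Double precision: both orders agree.
print(round(sum_f64(terms)), round(sum_f64(reordered)))  # prints "1 1"
```

If the encoder and decoder end up with different rounded values in this way, their entropy models no longer match and the bitstream cannot be decoded correctly. Computing the sensitive layer in double precision makes such discrepancies far less likely.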
