Dynamic Scene Reconstruction from Single Landscape Image Using 4D Gaussian in the Wild

Published: 09 Sept 2024 · Last Modified: 12 Sept 2024 · ECCV 2024 Wild3D · CC BY 4.0
Keywords: Dynamic Scene Video, 4D Gaussians
Abstract: Building on the outstanding performance of 3D Gaussian splatting, recent multi-view 3D modeling studies have extended to 4D Gaussians. By jointly learning the temporal axis with 3D Gaussians, it is possible to reconstruct more realistic and immersive 4D scenes from multi-view landscape images. However, obtaining multi-view images that accurately reflect overall motion in the wild is extremely challenging. In the dynamic scene video field, pseudo-3D representation methods are combined with Layered Depth Images (LDIs), which allows scene elements to be rendered from new camera perspectives. LDIs, a simplified 3D representation that separates a single image into depth-based layers, have limitations in reconstructing complex scenes, and artifacts can occur when continuous elements such as fluids are split across layers. This paper proposes representing a complete 3D space for dynamic scene videos by modeling explicit representations, specifically 4D Gaussians, from a single image. The framework focuses on optimizing 3D Gaussians by generating multi-view images from a single image and then creating 3D motion to optimize 4D Gaussians. A key component is consistent 3D motion estimation, which aligns common motion across the multi-view images so that the motion in 3D space more closely matches the actual motion. Our model demonstrates the ability to deliver realistic immersion on in-the-wild landscape images through various experiments and metrics. Extensive experimental results are available at https://cvsp-lab.github.io/3D_MRM_page/
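To make the "4D Gaussian" idea in the abstract concrete, the following is a minimal sketch (not the authors' code) of a time-conditioned set of 3D Gaussians, where each Gaussian mean is displaced by a small per-Gaussian motion trajectory queried at an arbitrary timestamp. The polynomial motion basis, the function name `polynomial_motion`, and all array shapes are assumptions made purely for illustration, not the paper's actual parameterization.

```python
# Minimal sketch of a time-varying ("4D") Gaussian point set.
# Each 3D Gaussian keeps a canonical mean and a learned per-Gaussian
# motion; querying a timestamp yields the means used for rendering.
import numpy as np

rng = np.random.default_rng(0)
N = 1024                                   # number of Gaussians (toy value)
means0 = rng.normal(size=(N, 3))           # canonical 3D means at t = 0
scales = np.exp(rng.normal(size=(N, 3)))   # per-axis scales (log-parameterized)
motion_coeffs = 0.01 * rng.normal(size=(N, 2, 3))  # per-Gaussian motion coefficients


def polynomial_motion(t: float) -> np.ndarray:
    """Displace each Gaussian mean with a degree-2 polynomial in time t in [0, 1]."""
    basis = np.array([t, t * t])                             # (2,)
    offsets = np.einsum("b,nbd->nd", basis, motion_coeffs)   # (N, 3)
    return means0 + offsets


# Render-time query: Gaussian positions at an intermediate timestamp.
means_t = polynomial_motion(0.5)
print(means_t.shape)  # (1024, 3)
```

In a full pipeline of the kind the abstract describes, such motion parameters would be optimized jointly with the Gaussian attributes against multi-view images generated from the single input image, rather than sampled randomly as in this toy example.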
Submission Number: 30