LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field

Published: 20 Jul 2024, Last Modified: 21 Jul 2024 · MM 2024 Oral · CC BY 4.0
Abstract: A cinemagraph is a unique form of visual media that combines still photography with subtle, repeated motion to create a captivating experience. However, the videos produced by most recent works lack depth information and are confined to 2D image space. In this paper, inspired by the significant progress in novel view synthesis (NVS) achieved by 3D Gaussian Splatting (3D-GS), we propose \textbf{\textit{LoopGaussian}}, which elevates cinemagraphs from 2D image space to 3D space via 3D Gaussian modeling. To achieve this, we first employ 3D-GS to reconstruct 3D Gaussian point clouds from multi-view images of a static scene, incorporating shape regularization terms to prevent the blurring and artifacts caused by object deformation. We then adopt an autoencoder tailored to 3D Gaussians to project them into a feature space. To maintain the local continuity of the scene, we devise SuperGaussian clustering based on the learned features. By computing inter-cluster similarity and applying a two-stage estimation method, we derive an Eulerian motion field that describes velocities across the entire scene. The 3D Gaussian points then move within the estimated Eulerian motion field, and a bidirectional animation technique yields a 3D cinemagraph with natural, seamlessly loopable dynamics. Experimental results validate the effectiveness of our approach, demonstrating high-quality and visually appealing scene generation.
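The abstract's core animation step — advecting 3D Gaussian points through a static (Eulerian) velocity field — can be sketched as below. This is a minimal illustration, not the paper's implementation: `advect_points`, the toy `swirl` field, and the forward-Euler time step are all assumptions introduced here.

```python
import numpy as np

def advect_points(points, velocity_field, num_frames, dt=1.0 / 30):
    """Advect 3D points through a static (Eulerian) velocity field.

    `velocity_field` maps an (N, 3) array of positions to an (N, 3)
    array of velocities. The field is fixed in space, so each point
    simply samples the velocity at its current location every frame.
    """
    frames = [points.copy()]
    for _ in range(num_frames - 1):
        v = velocity_field(frames[-1])       # sample field at current positions
        frames.append(frames[-1] + dt * v)   # forward-Euler integration step
    return frames

# Hypothetical toy field: a gentle swirl around the z-axis.
def swirl(p):
    return np.stack([-p[:, 1], p[:, 0], np.zeros(len(p))], axis=1)

pts = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
traj = advect_points(pts, swirl, num_frames=5)
```

In an Eulerian formulation the velocity is a function of position rather than of the individual point, which is what lets a single estimated field drive every Gaussian in the scene.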
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Generation] Generative Multimedia
Relevance To Conference: The contribution described in this work significantly enhances multimedia/multimodal processing, particularly for cinemagraphs:
(1) Expanding cinemagraphs to 3D space: Cinemagraphs are typically confined to 2D image space, which limits depth and immersion. This work proposes "LoopGaussian," which elevates cinemagraphs to 3D space on the basis of 3D Gaussian representations, adding an extra dimension that makes the result more immersive and visually engaging.
(2) Integration of novel view synthesis (NVS) techniques: By leveraging 3D Gaussian Splatting (3D-GS), the work incorporates advanced techniques from the field of novel view synthesis. This enables the generation of novel views of the static scene, enriching the user experience with different perspectives within the cinemagraph.
(3) Shape regularization for deforming objects: Shape regularization terms ensure that objects within the cinemagraph maintain their shape and integrity even as they undergo deformation, preventing the blurring and artifacts that object motion may otherwise introduce.
(4) Autoencoder for the 3D Gaussian representation: An autoencoder designed for 3D Gaussian representations projects them into a feature space without requiring prior knowledge, enabling efficient processing and manipulation of the data.
(5) Clustering approach inspired by superpixels: A superpixel-inspired clustering of the 3D Gaussian point cloud improves the organization and segmentation of the data, aiding the identification and characterization of motion patterns and contributing to more coherent and visually appealing effects.
(6) Eulerian motion field and symmetric motion techniques: Characterizing scene motion with an Eulerian motion field and applying symmetric (bidirectional) motion enables seamless loop effects, so the motion appears natural and continuous and further enhances the immersive experience.
Overall, this work significantly advances multimedia/multimodal processing by extending cinemagraphs into 3D space, integrating advanced NVS techniques, and employing innovative approaches for motion characterization and synthesis.
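The seamless-loop idea in point (6) can be illustrated with a generic cross-fade between a forward pass and its time-reversed copy, so the clip ends exactly where it began. This is a hedged sketch of the general symmetric-motion technique; `make_loop` and the linear blend weights are assumptions of this example, not the paper's exact scheme.

```python
import numpy as np

def make_loop(forward_frames):
    """Blend a forward pass with its time-reversed copy.

    The blend weight ramps from 0 to 1 across the clip, so the first
    output frame equals forward_frames[0] and the last output frame
    equals the reversed sequence's final frame, which is also
    forward_frames[0] -- hence the loop closes seamlessly.
    """
    n = len(forward_frames)
    backward = forward_frames[::-1]
    loop = []
    for t in range(n):
        w = t / (n - 1)  # 0 at the start of the clip, 1 at the end
        loop.append((1 - w) * forward_frames[t] + w * backward[t])
    return loop

# Toy 1D trajectory of per-frame point positions.
frames = [np.array([float(i), 0.0]) for i in range(4)]
loop = make_loop(frames)
```

Because both endpoints of the blended sequence coincide with the initial frame, playing the result on repeat produces continuous motion with no visible seam.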
Submission Number: 226