TL;DR: We propose adaptation 3D Gaussian Splatting for 2D images which allows real-life modification of images.
Abstract: Implicit Neural Representations (INRs) approximate discrete data through continuous functions and are commonly used for encoding 2D images. Traditional image-based INRs employ neural networks to map pixel coordinates to RGB values, capturing shapes, colors, and textures within the network’s weights. Recently, GaussianImage has been proposed as an alternative, using Gaussian functions instead of neural networks to achieve comparable quality and compression. Such a solution obtains a quality and compression ratio similar to classical INR models but does not allow image modification. In contrast, our work introduces a novel method, MiraGe, which uses mirror reflections to perceive 2D images in 3D space and employs flat-controlled Gaussians for precise 2D image editing. Our approach improves the rendering quality and allows realistic image modifications, including human-inspired perception of photos in the 3D world. Thanks to modeling images in 3D space, we obtain the illusion of 3D-based modification in 2D images. We also show that our Gaussian representation can be easily combined with a physics engine to produce physics-based modification of 2D images. Consequently, MiraGe allows for better quality than the standard approach and natural modification of 2D images.
Lay Summary: How do people perceive flat 2D objects such as a photograph or a sheet of paper in the 3D world? Inspired by this question, our work explores 2D image representation through the lens of human perception by introducing the MiraGe model. Just as people intuitively understand and manipulate physical photographs by rotating, bending or reorienting them in 3D space, our model allows image representation to simulate this perceptual process. In order to represent an object in 3D, we use parameterized 3D Gaussian primitives, which we control. This allows us not only to represent the image well but also to edit it. While the scene is not fully reconstructed in three dimensions, the use of 3D Gaussians to represent a 2D image allows the model to create new views through camera movement. This technique produces a perceptual 2.5D effect, a strategy frequently employed in computer graphics and video games in otherwise flat, distant backgrounds.
Link To Code: https://github.com/waczjoan/MiraGe/
Primary Area: Applications->Computer Vision
Keywords: Gaussian Splatting, image reconstruction, INR
Submission Number: 7519
Loading