Abstract: 3D GAN inversion enables not only 3D reconstruction from a 2D image, but also novel view synthesis and image editing. Existing works ensure novel view synthesis quality by constraining the synthesized views to conform to the real image distribution. However, most methods do not consider multi-view consistency, i.e., that different photos of the same 3D scene should be inverted to the same 3D scene. In this paper, we propose a bidirectional encoder (BiDiE) for 3D GAN inversion that improves multi-view consistency and alleviates the interference of camera parameter prediction errors. On the one hand, the bidirectional encoder takes real images as input, estimates the camera parameters, and performs 3D reconstruction. On the other hand, it takes randomly sampled latent codes and camera parameters as input and generates synthesized images to assist the latent code learning process. In addition, we extend the latent space from W+ to W++ to improve its reconstruction and editing capabilities. Experiments on the FFHQ, CelebA-HQ, and Multi-PIE datasets show that our proposed method outperforms state-of-the-art methods in multi-view consistent reconstruction as well as editing capability. Code and datasets are available at https://github.com/WHZMM/BiDiE
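The two directions described in the abstract can be sketched as a toy training step. This is a minimal illustration under assumed names (`ToyGenerator`, `ToyBiDiE`, toy dimensions) and linear stand-ins for the real networks, not the authors' implementation: direction 1 inverts a real image into a latent code and estimated camera for reconstruction; direction 2 samples a latent code and camera, renders a synthetic image, and supervises the encoder with the known ground truth.

```python
# Hypothetical sketch of a bidirectional encoder training step.
# All names and sizes here are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
W_DIM, CAM_DIM, IMG_DIM = 8, 4, 16  # toy latent, camera, image sizes


class ToyGenerator:
    """Stand-in for a pretrained 3D GAN: (latent, camera) -> image."""
    def __init__(self):
        self.proj = rng.standard_normal((W_DIM + CAM_DIM, IMG_DIM))

    def render(self, w, cam):
        return np.concatenate([w, cam]) @ self.proj


class ToyBiDiE:
    """Stand-in bidirectional encoder: image -> (latent code, camera params)."""
    def __init__(self):
        self.enc_w = rng.standard_normal((IMG_DIM, W_DIM)) * 0.1
        self.enc_cam = rng.standard_normal((IMG_DIM, CAM_DIM)) * 0.1

    def encode(self, img):
        return img @ self.enc_w, img @ self.enc_cam


G = ToyGenerator()
E = ToyBiDiE()

# Direction 1: real image -> estimated latent + camera -> 3D reconstruction.
real_img = rng.standard_normal(IMG_DIM)
w_hat, cam_hat = E.encode(real_img)
recon = G.render(w_hat, cam_hat)
recon_loss = float(np.mean((recon - real_img) ** 2))

# Direction 2: sample (latent, camera), synthesize an image, and supervise
# the encoder with the known ground-truth latent code and camera parameters.
w_gt = rng.standard_normal(W_DIM)
cam_gt = rng.standard_normal(CAM_DIM)
synth = G.render(w_gt, cam_gt)
w_pred, cam_pred = E.encode(synth)
latent_loss = float(np.mean((w_pred - w_gt) ** 2))
camera_loss = float(np.mean((cam_pred - cam_gt) ** 2))
```

Because direction 2 provides exact latent and camera targets for synthetic images, it can stabilize the encoder against camera prediction errors that pure image-reconstruction supervision (direction 1) cannot disambiguate.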