3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models

17 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: human pose and shape estimation, generative models, human pose dataset, automatic annotation
TL;DR: generate high quality human images as well as corresponding 2D/3D annotations using controllable diffusion models
Abstract: Human pose and shape estimation from monocular images plays a fundamental role in computer vision applications such as augmented reality, virtual try-on, and human motion analysis. However, annotating 2D skeleton keypoints is laborious, and obtaining high-quality 3D human meshes via motion capture or computer graphics rendering is even more expensive. In this work, we propose an effective approach based on recent diffusion models, termed \emph{\Ours}, which can effortlessly generate human images together with corresponding 2D human skeleton and 3D mesh annotations. Specifically, we first leverage a multi-conditioned stable diffusion model to generate diverse human images and initial ground-truth labels. At the core of this step is that we can easily obtain numerous depth and keypoint conditions from a 3D human parametric model, e.g., SMPL-X, by rendering the 3D mesh onto the image plane. The generated human image and the corresponding 3D mesh with camera parameters can be regarded as a pair of training samples. As the initial labels inevitably contain noise, we then cast the problem as a label-denoising process: an off-the-shelf 2D human pose estimator filters out negative data pairs and further optimizes the pose parameters. Finally, we can build a unified human pose dataset with both 2D skeleton and 3D parametric model annotations. Experiments on 2D datasets (COCO, OCHuman) and 3D datasets (3DPW, RICH, SSP-3D) demonstrate the effectiveness of our approach. Thus, our method offers a promising avenue for advancing the field of human pose and shape estimation by generating large-scale human images and high-quality annotations in a fully automated fashion.
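The label-denoising step described in the abstract can be illustrated with a minimal sketch: a generated (image, mesh) pair is kept only if the SMPL-X keypoints projected onto the image plane agree with an off-the-shelf 2D pose estimator's predictions. The function names, data layout, and the pixel threshold below are illustrative assumptions, not the paper's actual code or criteria.

```python
# Hypothetical sketch of the negative-pair filtering described in the abstract.
# A pair is kept when the mean pixel distance between the projected SMPL-X
# keypoints and the detected 2D keypoints is below a threshold (an assumed
# criterion; the paper may use a different agreement measure).
import math

def mean_keypoint_error(projected, detected):
    """Mean Euclidean distance between projected mesh keypoints and
    detected 2D keypoints; both are lists of (x, y) pixel coordinates."""
    dists = [math.dist(p, d) for p, d in zip(projected, detected)]
    return sum(dists) / len(dists)

def filter_pairs(pairs, pixel_threshold=10.0):
    """Drop pairs whose initial labels disagree with the 2D detector."""
    return [pair for pair in pairs
            if mean_keypoint_error(pair["projected_kpts"],
                                   pair["detected_kpts"]) <= pixel_threshold]

# Toy example: one consistent pair, one noisy pair.
pairs = [
    {"projected_kpts": [(100, 50), (120, 80)],
     "detected_kpts":  [(102, 51), (119, 83)]},   # small error -> kept
    {"projected_kpts": [(100, 50), (120, 80)],
     "detected_kpts":  [(160, 90), (40, 10)]},    # large error -> dropped
]
kept = filter_pairs(pairs)
print(len(kept))  # -> 1
```

In the actual pipeline, the surviving pairs would then be refined by optimizing the SMPL-X pose parameters against the detected keypoints rather than simply thresholded.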
Supplementary Material: zip
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 849