Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms

06 Jun 2022, 14:05 (modified: 12 Oct 2022, 16:50) NeurIPS 2022 Datasets and Benchmarks Readers: Everyone
Keywords: Human Body Reconstruction, SMPL Model, 2D and 3D Pose, Pose and Shape Estimation, Human Mesh Recovery
TL;DR: Benchmarking different datasets, backbones and training strategies for 3D human pose and shape estimation
Abstract: 3D human pose and shape estimation (a.k.a. ``human mesh recovery'') has achieved substantial progress. Researchers have mainly focused on developing novel algorithms, while paying less attention to the other critical factors involved. This can lead to suboptimal baselines, hindering fair and faithful evaluation of newly designed methodologies. To address this problem, this work presents the \textit{first} comprehensive benchmarking study from three under-explored perspectives beyond algorithms. \emph{1) Datasets.} An analysis of 31 datasets reveals the distinct impacts of data samples: datasets featuring critical attributes (\emph{i.e.,} diverse poses, shapes, camera characteristics, backbone features) are more effective. Strategic selection and combination of high-quality datasets can yield a significant boost in model performance. \emph{2) Backbones.} Experiments with 10 backbones, ranging from CNNs to transformers, show that knowledge learnt from a closely related task is readily transferable to human mesh recovery. \emph{3) Training strategies.} Proper augmentation techniques and loss designs are crucial. With the above findings, we achieve a PA-MPJPE of 47.3 mm on the 3DPW test set with a relatively simple model. More importantly, we provide strong baselines for fair comparisons of algorithms, and recommendations for building effective training configurations in the future. Codebase is available at \url{}.
Supplementary Material: pdf
Author Statement: Yes
Contribution Process Agreement: Yes
In Person Attendance: Yes