RightSizing: Disentangling Generative Models of Human Body Shapes with Metric Constraints

Published: 13 May 2024, Last Modified: 28 May 2024, Venue: GI 2024 SD, License: CC BY 4.0
Letter Of Changes:
* Typo Fix: 1. "from from", raised by Reviewer 1. 2. Fig. 7 caption: two subfigures were both labeled (a).
* Video Update: 1. Added the authors and affiliations to the cover page.
* Acknowledgement: 1. Added an Acknowledgement section.
* Code Upload: 1. Added a GitHub repository link at the end of the Introduction section. 2. We have uploaded the code and trained model to the repository. 3. The repository is currently private; we will make it public once the paper is published.
* Reviewer 1: 1. "MakeHuman": We cite the paper and the open-source project at the end of Section 2.1.1.
* Reviewer 2: 1. "computing the feature derivative": We added a sentence at the beginning of Section 4.3. 2. "Discussion of result using extreme values": We added a paragraph at the end of Section 5.3.
Keywords: Human shape generation, disentanglement, generative models, latent representation, human body sizing
Abstract: Learning latent representations of 3D meshes is an effective technique for understanding the shape space of the human body. Such a representation compresses data by encoding complex 3D models as compact vectors, drastically reducing storage needs while maintaining high fidelity during reconstruction. With these representations, new shapes can be generated by sampling the latent space. In applications, this enables rapid prototyping for product design, product customization to fit human shape, and personalized avatars in virtual worlds. A simple and widely used way to obtain latent representations is principal component analysis (PCA). It reduces dimensionality by extracting orthogonal components that represent the most significant sources of variability in high-dimensional data while ensuring that the components are uncorrelated. In recent years, there has been significant interest in using deep generative models to learn latent representations. These models tend to have fewer parameters, have been demonstrated to achieve better reconstruction accuracy, and may generalize to larger deformations. For most graphics applications, however, random shape generation is less useful than generating shapes with specified properties. This requires the latent variables to be interpretable and meaningful. Recent efforts have aimed to "disentangle" the latent space to improve interpretability. Although there is no universally accepted formal definition of disentanglement, the consensus is that such a representation should isolate different factors of variation in the data. Specifically, a change in one factor of variation should result in a change in just one component of the learned representation.
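To make the PCA baseline mentioned in the abstract concrete, here is a minimal sketch of encoding, reconstructing, and sampling shapes with PCA. It is an illustration only and not the authors' method or code: the data are synthetic stand-ins for flattened mesh vertices, and the names `fit_pca`, `encode`, and `decode`, as well as the mesh and latent sizes, are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 "meshes" with 50 vertices (x, y, z) each,
# flattened to 150-dimensional vectors. Real data would be registered body scans.
n_meshes, n_vertices, k = 200, 50, 8
true_basis = rng.normal(size=(k, n_vertices * 3))
meshes = rng.normal(size=(n_meshes, k)) @ true_basis
meshes += 0.01 * rng.normal(size=meshes.shape)  # small measurement noise


def fit_pca(data, n_components):
    """Return the mean shape, orthonormal components, and per-component std."""
    mean = data.mean(axis=0)
    _, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    components = vt[:n_components]
    std = s[:n_components] / np.sqrt(len(data) - 1)
    return mean, components, std


def encode(x, mean, components):
    """Project flattened meshes onto the principal components."""
    return (x - mean) @ components.T


def decode(z, mean, components):
    """Map latent vectors back to flattened vertex coordinates."""
    return z @ components + mean


mean, components, std = fit_pca(meshes, k)
latents = encode(meshes, mean, components)
recon = decode(latents, mean, components)
rel_error = np.linalg.norm(recon - meshes) / np.linalg.norm(meshes)

# Generate a new shape by sampling each latent coordinate from a Gaussian
# with the standard deviation observed along that component.
z_new = rng.normal(size=k) * std
new_shape = decode(z_new, mean, components).reshape(n_vertices, 3)
```

In this sketch the latent coordinates are uncorrelated by construction, but nothing ties an individual coordinate to a single interpretable body measurement, which is the gap that disentanglement is meant to address.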
Supplementary Material: zip
Video: zip
Submission Number: 17