Codec Avatar Studio: Paired Human Captures for Complete, Driveable, and Generalizable Avatars

Published: 26 Sept 2024, Last Modified: 13 Nov 2024NeurIPS 2024 Track Datasets and Benchmarks PosterEveryoneRevisionsBibTeXCC BY-NC 4.0
Keywords: avatars, codec avatars, photorealistic avatars, human modelling, 3d reconstruction, telepresence
Abstract: To build photorealistic avatars that users can embody, human modelling must be complete (cover the full body), driveable (able to reproduce the current motion and appearance from the user), and generalizable (_i.e._, easily adaptable to novel identities). Towards these goals, _paired_ captures, that is, captures of the same subject obtained from systems of diverse quality and availability, are crucial. However, paired captures are rarely available to researchers outside of dedicated industrial labs: _Codec Avatar Studio_ is our proposal to close this gap. Towards generalization and driveability, we introduce a dataset of 256 subjects captured in two modalities: high resolution multi-view scans of their heads, and video from the internal cameras of a headset. Towards completeness, we introduce a dataset of 4 subjects captured in eight modalities: high quality relightable multi-view captures of heads and hands, full body multi-view captures with minimal and regular clothes, and corresponding head, hands and body phone captures. Together with our data, we also provide code and pre-trained models for different state-of-the-art human generation models. Our datasets and code are available at https://github.com/facebookresearch/ava-256 and https://github.com/facebookresearch/goliath.
Supplementary Material: pdf
Flagged For Ethics Review: true
Submission Number: 1059
Loading