Abstract: Recent advances in dynamic garment reconstruction have improved virtual try-on from monocular video streams. However, existing work has concentrated on its sub-tasks, namely static garment reconstruction and dynamic clothed human reconstruction, and these methods are difficult to extend to dynamic garment reconstruction.
The former is limited mainly by the lack of cross-frame correspondences and of an independent clothing topology in implicit garment fields. As a result, accurate motion information cannot be recovered during dynamic clothing reconstruction, and the stable topology that downstream tasks such as animation with physics engines depend on is missing. The latter usually binds garment motion to body or skeleton motion, which leads to rigid artifacts for loose-fitting garments.
Our key idea is to build a diffusion-based T-pose garment generator with a strong prior on garment structure. The generator is trained to produce a 2D clothing representation, termed FOSUP (Fourier Spherical Unwrapping), conditioned on a monocular video. FOSUP enables a bidirectional mapping between the 2D representation and the mesh through the FFT and inverse FFT, which maintains spatial order and adjacency. The generated FOSUP is then mapped back to a 3D mesh through the inverse FFT and transformed into pose space by a point transformation network to guide the 3D reconstruction of the entire sequence.
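To make the bidirectional mapping concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy closed surface sampled on a regular spherical grid, so that a 2D image stores 3D positions and neighbouring pixels correspond to neighbouring surface points. The function names, grid resolution, and toy surface are hypothetical; the actual FOSUP construction is not specified in this abstract. The sketch only illustrates that such a position map and its Fourier spectrum can be converted into each other without loss.

```python
import numpy as np


def sphere_grid_positions(height=32, width=64):
    """Sample a toy closed surface on a regular (theta, phi) grid.

    Returns an array of shape (height, width, 3): a 2D image whose pixels
    store 3D positions. The surface itself is a hypothetical stand-in for
    an unwrapped garment.
    """
    theta = np.linspace(0.05, np.pi - 0.05, height)              # polar angle, poles excluded
    phi = np.linspace(0.0, 2.0 * np.pi, width, endpoint=False)   # azimuth
    t, p = np.meshgrid(theta, phi, indexing="ij")
    radius = 1.0 + 0.2 * np.cos(3.0 * p) * np.sin(t)             # smooth bumps on a sphere
    return np.stack(
        [
            radius * np.sin(t) * np.cos(p),
            radius * np.sin(t) * np.sin(p),
            radius * np.cos(t),
        ],
        axis=-1,
    )


def to_frequency(position_map):
    """Forward map: 2D FFT over the two grid axes, per coordinate channel."""
    return np.fft.fft2(position_map, axes=(0, 1))


def to_positions(spectrum):
    """Inverse map: 2D inverse FFT back to the 3D position grid."""
    return np.fft.ifft2(spectrum, axes=(0, 1)).real


positions = sphere_grid_positions()
spectrum = to_frequency(positions)
recovered = to_positions(spectrum)

# The round trip should reproduce the grid up to floating-point error.
max_error = float(np.max(np.abs(recovered - positions)))
print(f"max round-trip error: {max_error:.2e}")
```

Because the grid layout is fixed, the recovered vertices keep the same ordering and neighbourhood structure as the input, which is the property the abstract attributes to the FFT-based mapping.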
To sufficiently train our framework and address the lack of domain-specific data, we have constructed a large-scale garment MoCap dataset. This dataset captures the motion of various loose garments and includes multi-view raw images, frame-by-frame human motion annotations, raw scanned point clouds, topology-independent garment templates, and garment meshes with cross-frame correspondences.
Comprehensive experiments demonstrate that our unwrapping-based representation and diffusion-based framework significantly improve the performance and robustness of dynamic garment reconstruction.