Distilling Semantic Features for 3D Cloth Representations from Vision Foundation Models

Published: 24 Apr 2024, Last Modified: 24 Apr 2024
Venue: ICRA 2024 Workshop on 3D Visual Representations for Robot Manipulation
License: CC BY 4.0
Keywords: 3D Cloth Representations, Semantic Features, Vision Foundation Models
TL;DR: This study investigates the advantages and challenges of vision foundation models in augmenting 3D representations of cloth-like deformable objects by extracting semantic information from RGB images.
Abstract: This study explores the use of vision foundation models to enhance 3D representations of cloth-like deformable objects. Focusing on the distillation of semantic information from RGB images, we examine the potential of pre-trained vision-language models to capture complex folded configurations of cloth. Our investigation reveals both challenges and preliminary successes in leveraging semantic information to improve the understanding and tracking of deformable object states.
Submission Number: 22