Keywords: Dexterous Grasping, Synthetic Data, Generative Models
Abstract: Grasping in cluttered scenes remains highly challenging for dexterous hands due to the scarcity of training data. To address this problem, we present a large-scale synthetic dataset encompassing 1319 objects, 8270 scenes, and 426 million grasps. Beyond benchmarking, we also explore data-efficient strategies for learning from grasping data. We find that the key to effective generalization is combining a conditional generative model that focuses on local geometry with a grasp dataset that emphasizes complex scene variations. Our proposed generative method outperforms all baselines in simulation experiments. Furthermore, it demonstrates zero-shot sim-to-real transfer through test-time depth restoration, attaining a 91% real-world success rate and showcasing the robust potential of training on fully synthetic data.
Supplementary Material: zip
Spotlight Video: mp4
Publication Agreement: pdf
Student Paper: yes
Submission Number: 655