Abstract: Robotic grasping is fundamental but remains challenging when the visual input is noisy or incomplete, such as point clouds captured from reflective objects or cluttered scenes. Existing methods often fail to detect proper grasp poses in these situations. This paper presents GVN (Grasp-Vote-Net), an end-to-end 6-DoF (degree-of-freedom) grasp prediction model operating on point clouds. GVN predicts grasp poses by having different parts of the object vote for grasp centers and then aggregating these votes into stable proposals. Moreover, to rapidly generate a training dataset from real-world scenes, human demonstration is combined with deep-learning-based grasp quality evaluation: this paper proposes GEN (Grasp-Evaluate-Net) to automatically annotate grasp labels on the collected scene data. Collecting 20K point clouds for network training takes only one hour. Unlike common approaches, this dataset generation method requires neither simulated calculations nor mesh models. Our experiments show that GVN achieves more robust performance than state-of-the-art methods when dealing with noisy input.
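The abstract gives no implementation details of the vote-and-aggregate step, but the description matches VoteNet-style Hough voting. The following is a minimal PyTorch sketch of that general pattern under stated assumptions; all names (`VoteModule`, `aggregate_votes`, the `radius` parameter, random center sampling in place of farthest point sampling) are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

class VoteModule(nn.Module):
    """Each seed point votes for a grasp center (VoteNet-style Hough voting sketch)."""
    def __init__(self, feat_dim: int):
        super().__init__()
        # Shared MLP mapping a seed feature to a 3D offset plus a feature residual.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, feat_dim),
            nn.ReLU(),
            nn.Linear(feat_dim, 3 + feat_dim),
        )

    def forward(self, seed_xyz: torch.Tensor, seed_feat: torch.Tensor):
        # seed_xyz: (B, N, 3) seed coordinates; seed_feat: (B, N, F) seed features.
        out = self.mlp(seed_feat)
        offset, feat_res = out[..., :3], out[..., 3:]
        vote_xyz = seed_xyz + offset      # each part of the object votes for a center
        vote_feat = seed_feat + feat_res
        return vote_xyz, vote_feat

def aggregate_votes(vote_xyz, vote_feat, num_proposals=64, radius=0.05):
    """Group votes near sampled cluster centers and average them into proposals."""
    B, N, _ = vote_xyz.shape
    # Randomly sample cluster centers (farthest point sampling would be typical).
    idx = torch.randint(0, N, (B, num_proposals), device=vote_xyz.device)
    centers = torch.gather(vote_xyz, 1, idx.unsqueeze(-1).expand(-1, -1, 3))
    # Ball grouping: average all votes within `radius` of each center.
    dist = torch.cdist(centers, vote_xyz)                   # (B, P, N)
    mask = (dist < radius).float()
    w = mask / mask.sum(dim=-1, keepdim=True).clamp(min=1)  # normalized weights
    prop_xyz = w @ vote_xyz                                 # (B, P, 3) proposal centers
    prop_feat = w @ vote_feat                               # (B, P, F) proposal features
    return prop_xyz, prop_feat
```

Aggregating many per-part votes this way averages out the per-point noise that corrupts reflective or cluttered scenes, which is consistent with the robustness claim; a downstream head would then regress the full 6-DoF grasp pose from each proposal feature.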