Efficient Grasp Detection Network With Gaussian-Based Grasp Representation for Robotic Manipulation
Abstract: Deep learning methods have achieved excellent results in the field of grasp detection. However, deep learning models adapted from general object detection do not strike a proper balance between accuracy and inference speed, resulting in poor performance in real-time grasping tasks. This work proposes an efficient grasp detection network that takes n-channel images as inputs for robotic grasping. The proposed
network is a lightweight generative structure for grasp detection
in one stage. Specifically, a Gaussian kernel-based grasp representation is introduced to encode the training samples, modeling the grasp center as the point of highest grasp confidence. A receptive field block is
plugged into the bottleneck to improve the model’s feature
discriminability. In addition, pixel-based and channel-based
attention mechanisms are used to construct a multidimensional attention fusion network that fuses valuable semantic information by suppressing noisy features and highlighting object features. The proposed method is evaluated
on the Cornell, Jacquard, and extended OCID grasp
datasets. The experimental results show that our method achieves an excellent balance between accuracy and running speed. The network attains an inference time of 6 ms per image, achieving 97.8%, 95.6%, and 76.4% accuracy on the Cornell, Jacquard, and extended OCID grasp datasets, respectively. Moreover, a high grasp success rate is obtained in physical experiments with a UR5 robot arm.
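For intuition, here is a minimal sketch (not the authors' released code) of how a Gaussian kernel-based grasp encoding of this kind could look: each annotated grasp center is assigned a 2D Gaussian with unit peak, so that the center pixel carries the maximum grasp confidence and confidence decays smoothly away from it. The function name, the fixed sigma, and the element-wise maximum used to merge overlapping grasps are illustrative assumptions.

```python
import numpy as np

def gaussian_quality_map(centers, shape, sigma=2.0):
    """Encode grasp centers as a 2D Gaussian confidence heatmap.

    Each (row, col) grasp center contributes a Gaussian whose peak
    (value 1.0) marks the point of highest grasp confidence.
    Overlapping Gaussians are merged with an element-wise maximum.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]          # pixel coordinate grids, shape (h, w)
    q = np.zeros(shape, dtype=np.float32)
    for cy, cx in centers:
        g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
        q = np.maximum(q, g)             # keep the strongest confidence per pixel
    return q

# Example: two grasp centers on a 32x32 quality map
quality = gaussian_quality_map([(10, 12), (20, 24)], (32, 32), sigma=2.0)
assert abs(quality[10, 12] - 1.0) < 1e-6  # confidence peaks at each grasp center
```

Compared with the common binary (all-ones rectangle) labeling, such a soft target concentrates supervision at the most reliable grasp point while still rewarding predictions near it.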