Object Pose Estimation for Robotic Grasping based on Multi-view Keypoint Detection

Zheyuan Hu, Renluan Hou, Jianwei Niu, Xiaolong Yu, Tao Ren, Qingfeng Li

Published: 2021, Last Modified: 14 Nov 2024 | ISPA/BDCloud/SocialCom/SustainCom 2021 | CC BY-SA 4.0

Abstract: Industrial robots can replace human labour in a variety of tasks. Among these tasks, grasping is the most fundamental industrial robot operation. However, conventional robotic grasping methods can fail when objects are cluttered or occluded. To address this issue, we adopt object pose estimation (OPE) to facilitate robotic grasping of cluttered and occluded objects and propose an object detection model based on 2D RGB multi-view features. The proposed model is built by adding four transposed convolution layers to the ResNet backbone to obtain 2D feature maps of object keypoints in each image. In addition, we design a feature-fusion model that produces the 3D coordinates of keypoints from the 2D multi-view features using the volumetric aggregation method, along with a keypoint-detection confidence for each view that helps judge whether a robotic grasp is optimal. Extensive experiments are conducted to verify the accuracy of OPE, and the results indicate substantial performance improvements of the proposed approach over conventional methods in various scenarios.
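
The fusion step described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the cameras, grid size, or how the per-view confidence enters the fusion. The sketch assumes two hypothetical pinhole cameras, Gaussian functions standing in for the network's 2D keypoint heatmaps, a confidence-weighted sum as the aggregation rule, and a soft-argmax over the voxel grid to read out the 3D coordinate. All names and constants (`project`, `fuse_views`, `FOCAL`, `beta`, and so on) are illustrative.

```python
import math

FOCAL, CENTER, SIGMA = 100.0, 32.0, 2.0  # assumed intrinsics and heatmap width, in pixels

# Hypothetical cameras as (rotation rows, translation): x_cam = R @ x_world + t.
CAM_FRONT = (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0, 0, 5))   # looks along world +z
CAM_SIDE = (((0, 0, -1), (0, 1, 0), (1, 0, 0)), (0, 0, 5))   # looks along world +x


def project(point, cam):
    """Pinhole projection of a 3D world point to pixel coordinates (u, v)."""
    rot, trans = cam
    x, y, z = (sum(r * p for r, p in zip(row, point)) + t
               for row, t in zip(rot, trans))
    return FOCAL * x / z + CENTER, FOCAL * y / z + CENTER


def make_heatmap(peak_uv):
    """Stand-in for a network output: a Gaussian bump around a detected 2D keypoint."""
    u0, v0 = peak_uv
    return lambda u, v: math.exp(-((u - u0) ** 2 + (v - v0) ** 2) / (2 * SIGMA ** 2))


def fuse_views(views, grid_size=17, half_extent=1.0, beta=20.0):
    """Fuse per-view heatmaps in a voxel grid and return a soft-argmax 3D estimate.

    views: list of (camera, heatmap, confidence) tuples.
    """
    step = 2 * half_extent / (grid_size - 1)
    axis = [-half_extent + i * step for i in range(grid_size)]
    voxels, scores = [], []
    for x in axis:
        for y in axis:
            for z in axis:
                voxel = (x, y, z)
                # Confidence-weighted sum of the heatmap value each view sees here
                # (an assumed aggregation rule, not taken from the paper).
                score = sum(conf * heat(*project(voxel, cam))
                            for cam, heat, conf in views)
                voxels.append(voxel)
                scores.append(score)
    # Softmax over the whole volume, shifted by the max for numerical stability.
    top = max(scores)
    weights = [math.exp(beta * (s - top)) for s in scores]
    total = sum(weights)
    # Soft-argmax: expected voxel position under the softmax distribution.
    return tuple(sum(w * v[k] for w, v in zip(weights, voxels)) / total
                 for k in range(3))


# Ground-truth keypoint, placed on a voxel centre to keep the example easy to check.
TRUE_POINT = (0.25, -0.125, 0.25)
VIEWS = [
    (CAM_FRONT, make_heatmap(project(TRUE_POINT, CAM_FRONT)), 0.9),
    (CAM_SIDE, make_heatmap(project(TRUE_POINT, CAM_SIDE)), 0.8),
]
ESTIMATE = fuse_views(VIEWS)
```

Each view alone constrains the keypoint only to a ray through its camera; the summed scores peak where the rays from both views meet, which is why `ESTIMATE` lands close to `TRUE_POINT`. In this toy setup the accuracy is limited by the voxel spacing, and a learned model would normally process the aggregated volume further before the read-out.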