Occlusion-Aware 6D Pose Estimation with Depth-Guided Graph Encoding and Cross-Semantic Fusion for Robotic Grasping
Abstract: Reliable 6D pose estimation is crucial for robotic tasks but remains challenging in environments with occlusion. Recent approaches tend to regress pose parameters directly with deep neural networks, and therefore struggle to model the non-adjacent, complex relationships among surface points that arise in occluded scenes. To address this problem, we propose a novel occlusion-aware 6D pose estimation framework that uses a depth-guided graph neural network (GNN) to model such latent relationships from RGB-D input. Two kinds of semantic information, the object mask and the object's binary code, are adaptively fused to extract features relevant to 2D-3D correspondence in an effective manner. Both the enhanced graph features and the fused semantic information contribute to improved pose estimation under occlusion. Extensive experiments show that our approach outperforms comparative methods by 1.2% and 1.9% on the LMO and YCBV datasets, respectively (by up to 30% for certain objects), and its validity is further verified in a real-world pose estimation test.
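To make the two components named in the abstract more concrete, below is a minimal PyTorch sketch of a depth-guided graph encoder and a gated cross-semantic fusion step: depth pixels are back-projected to 3D with the camera intrinsics, a k-NN graph over those points drives one EdgeConv-style message-passing layer, and a learned gate adaptively blends mask features with binary-code features. All names here (backproject_depth, knn_graph, GraphLayer, SemanticFusion) and the specific layer choices are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


def backproject_depth(depth, K):
    """Lift a depth map (H, W) to camera-frame 3D points (H*W, 3) using intrinsics K."""
    H, W = depth.shape
    v, u = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    z = depth.reshape(-1)
    x = (u.reshape(-1).float() - K[0, 2]) * z / K[0, 0]
    y = (v.reshape(-1).float() - K[1, 2]) * z / K[1, 1]
    return torch.stack([x, y, z], dim=1)


def knn_graph(points, k=8):
    """Indices (N, k) of each point's k nearest neighbours, excluding the point itself."""
    d = torch.cdist(points, points)                     # (N, N) pairwise distances
    return d.topk(k + 1, largest=False).indices[:, 1:]  # drop the self-match at column 0


class GraphLayer(nn.Module):
    """One EdgeConv-style message-passing layer over the depth-derived k-NN graph."""

    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, feats, nbr_idx):
        nbrs = feats[nbr_idx]                        # (N, k, dim) neighbour features
        center = feats.unsqueeze(1).expand_as(nbrs)  # (N, k, dim) repeated centers
        msgs = self.mlp(torch.cat([center, nbrs - center], dim=-1))
        return feats + msgs.max(dim=1).values        # max-aggregate, residual update


class SemanticFusion(nn.Module):
    """Gated fusion of two semantic streams (mask features, binary-code features)."""

    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, mask_feat, code_feat):
        g = self.gate(torch.cat([mask_feat, code_feat], dim=-1))
        return g * mask_feat + (1 - g) * code_feat   # adaptive per-channel mix


if __name__ == "__main__":
    # Toy end-to-end pass with random inputs and placeholder intrinsics.
    K = torch.tensor([[572.4, 0.0, 32.0],
                      [0.0, 573.6, 24.0],
                      [0.0, 0.0, 1.0]])
    pts = backproject_depth(torch.rand(48, 64), K)   # (3072, 3) point cloud
    graph = knn_graph(pts, k=8)                      # (3072, 8) neighbour indices
    feats = GraphLayer(64)(torch.randn(pts.shape[0], 64), graph)
    fused = SemanticFusion(64)(feats, torch.randn_like(feats))
    print(fused.shape)                               # torch.Size([3072, 64])
```

The gating design is one plausible reading of "adaptively fused": the sigmoid gate learns a per-channel trade-off between the mask stream and the binary-code stream rather than fixing a concatenation or sum.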