Abstract: Accurate 6D object pose estimation is a fundamental problem in industrial bin-picking scenes, and it is challenging because parts in a dense pile heavily occlude one another. Some recent works follow a two-stage approach: they first densely regress a rotation representation and then aggregate the predictions by simple averaging to estimate the pose. However, these methods suffer from the nonlinearity of the rotation space and are sensitive to inaccurate predictions. Instead, we propose a point-wise line voting network (PLVNet) that regresses point-wise offset vectors pointing to the axis lines, together with a RANSAC-based line fitting approach that aggregates these dense predictions. This transforms the rotation component into two perpendicular vectors and filters out outlier predictions during axis-line aggregation. In addition, a modified 3D U-Net is introduced to segment the object instances of interest from the entire scene, which explicitly casts visibility prediction as a binary classification problem. We conduct several experiments on a public dataset and in a real-world environment, and the results show that the proposed method can effectively estimate the poses of objects in industrial bin-picking scenes.
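To illustrate the aggregation idea, the following is a minimal sketch of RANSAC line fitting over voted 3D points, assuming each vote is an input point plus its predicted offset and should therefore lie on an axis line. It is not the authors' implementation: the function names (`ransac_line`, `point_line_distance`), the inlier threshold, the iteration count, and the toy data are all assumptions made for illustration.

```python
import math
import random


def sub(a, b):
    """Component-wise difference of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    """Euclidean length of a 3D vector."""
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)


def point_line_distance(q, origin, direction):
    """Distance from q to the line through origin along a unit direction."""
    return norm(cross(sub(q, origin), direction))


def ransac_line(votes, threshold=0.01, iterations=200, seed=0):
    """Fit one 3D line to voted points, ignoring outlier votes.

    threshold and iterations are illustrative values, not taken from the paper.
    Returns (point_on_line, unit_direction, inlier_votes).
    """
    rng = random.Random(seed)
    best_inliers, best_line = [], None
    for _ in range(iterations):
        a, b = rng.sample(votes, 2)      # a minimal sample: two points define a line
        d = sub(b, a)
        length = norm(d)
        if length < 1e-9:                # skip degenerate (coincident) pairs
            continue
        d = (d[0] / length, d[1] / length, d[2] / length)
        inliers = [q for q in votes if point_line_distance(q, a, d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers, best_line = inliers, (a, d)
    if best_line is None:
        raise ValueError("could not fit a line: all sampled pairs were degenerate")
    return best_line[0], best_line[1], best_inliers


# Toy data (hypothetical): 20 votes on the z-axis plus 3 gross outliers.
votes = [(0.0, 0.0, 0.1 * k) for k in range(20)]
votes += [(1.0, 2.0, 0.5), (-3.0, 0.4, 1.0), (2.0, -1.0, 0.2)]

origin, direction, inliers = ransac_line(votes)
print(len(inliers), direction)
```

In this toy case the three stray votes are rejected and the recovered direction is aligned with the z-axis (up to sign). A simple average of the same votes would be pulled away from the axis by the outliers, which is the sensitivity the abstract attributes to average-based aggregation.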
External IDs:dblp:journals/vc/WangYZWWZZX25