Abstract: In task-oriented grasping, the robot is supposed to manipulate objects in a task-compatible manner, which is more important but also more challenging than merely grasping them stably. However, most existing works perform task-oriented grasping only in single-object scenes. This greatly limits their practical application in real-world scenes, which usually contain multiple stacked objects with serious overlaps and occlusions. To perform task-oriented grasping in object stacking scenes, we first build a synthetic dataset named the Object Stacking Grasping Dataset (OSGD). Second, we construct a Conditional Random Field (CRF) to model the semantic contents of object regions. The modelled semantic contents can be described as the incompatibility of task labels and the continuity of task regions. The proposed approach greatly reduces the interference of overlaps and occlusions in object stacking scenes. To embed the CRF-based semantic model into our grasp detection network, we implement the inference process of the CRF as a recurrent neural network (RNN), so that the whole model, Task-oriented Grasping CRFs (TOG-CRFs), can be trained end to end. Finally, in object stacking scenes, the constructed model helps the robot achieve a 69.4% success rate for task-oriented grasping.
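The idea of running CRF inference as a recurrent computation can be illustrated with a small sketch. The abstract does not say which inference algorithm is used, so the code below assumes mean-field inference, a common choice when a CRF is unrolled as an RNN. It is not the authors' TOG-CRFs implementation: the function names, the toy `affinity` matrix (standing in for continuity of task regions), and the Potts-style `incompat` matrix (standing in for incompatibility of task labels) are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field_step(q, unary, affinity, incompat):
    """One mean-field update; this plays the role of the recurrent cell."""
    n, k = len(unary), len(unary[0])
    new_q = []
    for i in range(n):
        # Message passing: affinity-weighted sum of the other regions' beliefs.
        msg = [sum(affinity[i][j] * q[j][l] for j in range(n) if j != i)
               for l in range(k)]
        # Compatibility transform: penalty for taking label l given the
        # labels the neighbouring regions currently believe in.
        penalty = [sum(incompat[l][l2] * msg[l2] for l2 in range(k))
                   for l in range(k)]
        # Combine with the unary logits and renormalise.
        new_q.append(softmax([unary[i][l] - penalty[l] for l in range(k)]))
    return new_q


def mean_field_inference(unary, affinity, incompat, num_iters=5):
    """Unroll a fixed number of updates, as an RNN would over time steps."""
    q = [softmax(row) for row in unary]
    for _ in range(num_iters):
        q = mean_field_step(q, unary, affinity, incompat)
    return q


# Toy example (assumed values): 3 regions, 2 task labels.
unary = [[2.0, 0.0], [0.0, 0.5], [2.0, 0.0]]   # per-region label logits
affinity = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
incompat = [[0.0, 1.0], [1.0, 0.0]]             # Potts-style penalty

beliefs = mean_field_inference(unary, affinity, incompat)
labels = [row.index(max(row)) for row in beliefs]
print(labels)  # [0, 0, 0]
```

In this toy case the middle region prefers label 1 when considered alone, but its two confident neighbours pull it to label 0, which is the kind of smoothing the pairwise terms are meant to provide. In a trainable version, each update would be written with differentiable tensor operations so that gradients can flow through the unrolled iterations.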