Toward a Plug-and-Play Vision-Based Grasping Module for Robotics

Published: 16 Apr 2024, Last Modified: 02 May 2024, MoMa WS 2024 Poster, CC BY 4.0
Keywords: Grasping, Software-Hardware Integration for Robot Systems
Abstract: Despite recent advancements in AI for robotics, grasping remains a partially solved challenge. The lack of benchmarks and reproducibility prevents the development of robots that can autonomously interact with open environments. The generalization capabilities of foundation models are promising, but their computational cost is very high, and the adaptation capabilities demonstrated on real robots are still limited. This paper takes the opposite perspective by introducing a vision-based grasping framework that can easily be transferred across multiple manipulators. Leveraging Quality-Diversity (QD) algorithms, the framework generates repertoires of diverse open-loop grasping trajectories, enhancing adaptability while preserving grasp diversity. The framework addresses two main issues: the lack of an off-the-shelf vision module for detecting object pose, and the generalization of QD trajectories to the whole robot operational space. The proposed solution combines multiple vision modules for 6DoF object detection and tracking while rigidly transforming QD-generated trajectories into the object frame. Experiments on a Franka Research 3 arm and a UR5 arm equipped with a Schunk SIH hand demonstrate comparable performance when the real scene aligns with the simulation used for grasp generation. This work represents a significant stride toward building a reliable vision-based grasping module that is transferable to new manipulator platforms and adaptable to diverse scenarios without further training iterations.
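The abstract's central mechanism, re-expressing QD-generated open-loop trajectories in the frame of the detected object, can be sketched as a rigid-transform composition. The snippet below is a minimal illustration, not the authors' implementation: all function and variable names (`pose_to_matrix`, `retarget_trajectory`, `T_base_object`) are hypothetical, and it assumes waypoints are stored as 4x4 homogeneous end-effector poses defined relative to the object used in simulation.

```python
# Minimal sketch (illustrative, not the paper's code) of re-targeting QD grasp
# waypoints: poses generated in an object-centric frame are rigidly transformed
# into the robot base frame using the 6DoF object pose from the vision module.
import numpy as np

def pose_to_matrix(position, rotation):
    """Build a 4x4 homogeneous transform from a (3,) position and (3,3) rotation matrix."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = position
    return T

def retarget_trajectory(waypoints_in_object, T_base_object):
    """Express object-frame end-effector waypoints (list of 4x4 transforms) in the base frame."""
    return [T_base_object @ T_object_ee for T_object_ee in waypoints_in_object]

# Example: object detected 40 cm in front of the robot, rotated 90 degrees about z.
theta = np.pi / 2
R_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0,            0.0,           1.0]])
T_base_object = pose_to_matrix(np.array([0.40, 0.0, 0.05]), R_z)

# One QD waypoint: gripper 10 cm above the object, aligned with the object frame.
waypoint = pose_to_matrix(np.array([0.0, 0.0, 0.10]), np.eye(3))
trajectory_in_base = retarget_trajectory([waypoint], T_base_object)
```

Because the trajectories are open-loop, this single rigid transform is the only online adaptation step; everything downstream can be handed to the manipulator's standard Cartesian or joint-space controller.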
Submission Number: 5