Ground4Act: Leveraging visual-language model for collaborative pushing and grasping in clutter

Published: 01 Jan 2024, Last Modified: 13 Nov 2024, Image Vis. Comput. 2024, CC BY-SA 4.0
Abstract: Highlights
• Ground4Act is developed via visual grounding for target-oriented tasks in clutter.
• DQN-based policy achieves pushing of non-target objects to make the target graspable.
• Push and grasp in a uniform format for common deployment from simulation to reality.