Keywords: Vision Language Models, Manipulation Planning, Path-following MPC
Abstract: Language-based robot control, in which large language models (LLMs) reason about the environment, is a powerful and versatile way to control a robot manipulator. However, the robot motions generated by these controllers often lack safety and performance, resulting in jerky movements. In this work, a novel modular framework for zero-shot motion planning for manipulation tasks is developed. The modular components do not require any motion-planning-specific training. An LLM is combined with a vision model to create Python code that interacts with a novel path planner, which generates a piecewise linear reference path together with bounds around it that ensure safety. An optimization-based planner, the BoundMPC framework, is then used to execute optimal, safe, and collision-free trajectories along the reference path. The effectiveness of the approach is demonstrated on various everyday manipulation tasks in simulation and experiment; a video is available at www.acin.tuwien.ac.at/42d2.
Video: https://www.youtube.com/watch?v=phPNaDWIe9I, https://www.acin.tuwien.ac.at/42d2, https://www.acin.tuwien.ac.at/42d3
Publication Agreement: pdf
Student Paper: yes
Spotlight Video: mp4
Supplementary Material: zip
Submission Number: 211