Wearable Robot Control Method Based on Vision-Language Models

Published: 05 Apr 2024, Last Modified: 14 Apr 2024
Venue: VLMNM 2024
License: CC BY 4.0
Keywords: Vision-Language Model (VLM), Human-Robot Interaction (HRI), weight estimation, wearable robot control
TL;DR: A novel approach for controlling wearable assistive devices using VLMs and "training-less" datasets
Abstract: This study introduces a novel application of Vision-Language Models (VLMs) to human-robot interaction, specifically controlling wearable robots with only a single camera setup. Leveraging the pre-trained knowledge of VLMs, our approach estimates the weight of a grasped object in an industrial setup and adjusts the robot's assistance mode accordingly. We detail the methodology of the control framework, including prompts that specify hand gesture detection, identification of the grasped object, weight estimation, the training-less dataset, and the response format for robot control. This allows the system to adapt to a specific user environment without extensive dataset collection, model training, or fine-tuning. Our method has been demonstrated in real-world applications with the wearable robot to confirm its feasibility.
Supplementary Material: zip
Submission Number: 30
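
The submission's actual prompt text, object list, and robot interface are not included on this page; the following is a minimal Python sketch of the pipeline the abstract describes, assuming a hypothetical `query_vlm` helper (stubbed here with a canned reply), an illustrative user-supplied object-weight list standing in for the "training-less dataset", and made-up assist-mode thresholds.

```python
import json

# Hypothetical object-to-weight reference list (the "training-less dataset"):
# names and approximate weights the user writes down for their own workplace,
# with no model training or fine-tuning involved.
OBJECT_WEIGHTS_KG = {
    "power drill": 1.8,
    "steel bracket": 4.5,
    "loaded cardboard box": 12.0,
}

SYSTEM_PROMPT = f"""You control a wearable assistive robot from a single camera image.
1. Decide whether the worker's hand shows a grasping gesture.
2. If it does, identify the grasped object.
3. Estimate the object's weight, preferring this reference list when it matches:
{json.dumps(OBJECT_WEIGHTS_KG, indent=2)}
4. Answer ONLY with JSON of the form
   {{"grasping": bool, "object": str, "weight_kg": float, "assist_mode": "off"|"low"|"high"}}.
Use "high" above 10 kg, "low" between 2 and 10 kg, and "off" otherwise."""


def query_vlm(system_prompt: str, image_path: str) -> str:
    """Stand-in for the actual VLM call: send the prompt plus the camera frame
    to your multimodal model and return its raw text reply. A canned reply is
    returned here so the sketch runs end to end without an API key."""
    return '{"grasping": true, "object": "steel bracket", "weight_kg": 4.5, "assist_mode": "low"}'


def decide_assist_mode(image_path: str) -> str:
    """Query the VLM for one frame and map its structured reply to an assistance mode."""
    result = json.loads(query_vlm(SYSTEM_PROMPT, image_path))
    mode = result.get("assist_mode", "off")
    # Fall back to "off" if no grasp is reported or the mode is malformed.
    return mode if result.get("grasping") and mode in {"off", "low", "high"} else "off"


if __name__ == "__main__":
    print(decide_assist_mode("frame_0001.jpg"))  # -> "low" with the canned reply
```

In this sketch the weight reasoning and the mode decision are both pushed into the prompt, so adapting to a new workplace only means editing the reference list and thresholds, which mirrors the training-less adaptation the abstract claims.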