Visual Instruction as an Intuitive Interface for Robotic Sorting

Published: 01 Jan 2025, Last Modified: 12 Nov 2025, HCI (57) 2025, CC BY-SA 4.0
Abstract: Human-Robot Interaction (HRI) is a rapidly evolving field that seeks to bridge the gap between human users and increasingly capable robotic systems. While robotic technology has advanced significantly, designing intuitive and accessible interaction methods remains a key challenge. Traditional paradigms, such as programming-based interfaces or text-based commands, often impose steep learning curves and hinder broader adoption. While recent advancements in Large Language Models (LLMs) and Vision-Language Models (VLMs) have enabled natural language interaction, these approaches are prone to ambiguity, particularly when specifying spatial relationships.