Keywords: Human-Object Interaction, Human-Robot Collaboration, Human Intention
TL;DR: A deep learning framework for recognizing human intention through the anticipation of human-object interaction and its implementation in robot assistive tasks.
Abstract: Robots are becoming increasingly integrated into our lives, assisting us in various tasks. To ensure effective collaboration between humans and robots, it is essential that they understand our intentions and anticipate our actions. In this paper, we propose a Human-Object Interaction (HOI) anticipation framework for collaborative robots. We propose an efficient and robust transformer-based model to detect and anticipate HOIs from videos. This enhanced anticipation empowers robots to proactively assist humans, resulting in more efficient and intuitive collaborations. Our model outperforms state-of-the-art results in HOI detection and anticipation in VidHOI dataset with an increase of 1.76% and 1.04% in mAP respectively while being 15.4 times faster. We showcase the effectiveness of our approach through experimental results in a real robot, demonstrating that the robot's ability to anticipate HOIs is key for better Human-Robot Interaction.
Student First Author: yes
Supplementary Material: zip
Instructions: I have read the instructions for authors (https://corl2023.org/instructions-for-authors/)
Publication Agreement: pdf
Poster Spotlight Video: mp4