Human-Inspired Topological Representations for Visual Object Recognition in Unseen Environments

Published: 24 Apr 2024, Last Modified: 07 May 2024
Venue: ICRA 2024 Workshop on 3D Visual Representations for Robot Manipulation
Readers: Everyone
License: CC BY 4.0
Keywords: Recognition, RGB-D Perception, AI-Enabled Robotics
TL;DR: In a first-of-its-kind attempt, we obtain color regions, analogous to the MacAdam ellipses in human perception, to compute object color embeddings, which are fused with an object unity-based 3D shape descriptor for recognition.
Abstract: Visual object recognition in unseen and cluttered indoor environments is a challenging problem for mobile robots. Toward this goal, we extend our previous work to propose the TOPS2 descriptor and an accompanying recognition framework, THOR2, inspired by a human reasoning mechanism known as object unity. We interleave color embeddings obtained using the Mapper algorithm for topological soft clustering with the shape-based TOPS descriptor to obtain the TOPS2 descriptor. THOR2, trained using synthetic data, achieves substantially higher recognition accuracy than the shape-based THOR framework and outperforms RGB-D ViT on the UW-IS Occluded dataset recorded using commodity hardware. Therefore, THOR2 is a promising step toward achieving robust recognition in low-cost robots.
Submission Number: 7
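
The abstract mentions using the Mapper algorithm for topological soft clustering of color. As a rough illustration of that general idea only, and not the authors' implementation, the sketch below runs a toy one-dimensional Mapper on scalar color values: it covers the value range with overlapping intervals, clusters the points inside each interval, and lets points that fall in an overlap belong to several clusters with fractional weights. The function name `mapper_soft_clusters`, its parameters (`n_intervals`, `overlap`, `gap`), the use of a scalar color value as the filter, and the uniform weighting are all assumptions made for this example. It also treats the value as linear and assumes `overlap > 0`.

```python
from collections import defaultdict


def mapper_soft_clusters(values, n_intervals=4, overlap=0.25, gap=0.05):
    """Toy 1D Mapper (hypothetical helper, not from the paper).

    Returns (nodes, membership): nodes is a list of frozensets of point
    indices; membership maps each point index to {node_index: weight}.
    """
    lo, hi = min(values), max(values)
    length = (hi - lo) / n_intervals or 1.0  # avoid zero-length intervals
    pad = overlap * length                   # how far neighbours overlap

    nodes = []
    for i in range(n_intervals):
        a = lo + i * length - pad
        b = lo + (i + 1) * length + pad
        # Points whose value falls in this cover interval, sorted by value.
        idx = sorted(
            (j for j, v in enumerate(values) if a <= v <= b),
            key=lambda j: values[j],
        )
        # 1D single-linkage clustering: split where the gap is too large.
        current = []
        for j in idx:
            if current and values[j] - values[current[-1]] > gap:
                nodes.append(frozenset(current))
                current = []
            current.append(j)
        if current:
            nodes.append(frozenset(current))

    # Soft membership: a point shared by k nodes gets weight 1/k in each
    # (uniform weighting is an assumption of this sketch).
    membership = defaultdict(dict)
    for k, node in enumerate(nodes):
        for j in node:
            membership[j][k] = 1.0
    for weights in membership.values():
        total = sum(weights.values())
        for k in weights:
            weights[k] /= total
    return nodes, dict(membership)


if __name__ == "__main__":
    nodes, membership = mapper_soft_clusters(
        [0.0, 0.02, 0.5, 0.52, 1.0], n_intervals=2
    )
    print(nodes)
    print(membership)
```

In this toy setting, points that land in the overlap of two intervals end up in more than one node, which is what makes the clustering soft. How the paper actually derives color regions and embeddings from such a structure is not specified in the abstract.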