More Than Eleven Thousand Words: Towards Using Language Models for Robotic Sorting of Unseen Objects into Arbitrary Categories
Keywords: Language Models, Detection, Sorting
TL;DR: This paper explores the combination of a language model and an open vocabulary object detector.
Abstract: We consider the task of automatically sorting previously unseen objects into arbitrary categories. We aim to sort into general, high-level categories, in contrast to traditional methods that sort on visually discernible features or other sensor measurements. This paper explores a method that divides the task into two sub-tasks: object detection and categorization. In a set of experiments, we show that splitting the task into this two-stage process discards information that is important for robust categorization, and that the two-stage approach is less robust than an open vocabulary object detector. We hope these results are helpful for exploring the limits of Language Models for robotic tasks.