Keywords: Object Detection, Open-World Object Detection, Unknown Object Categorization
TL;DR: TARO is an open-world object detector that categorizes unknown objects into coarse semantic classes rather than assigning a generic "Unknown" label.
Abstract: Modern object detectors largely operate under a "closed-world" assumption, which limits them to a predefined set of classes and poses risks when they encounter novel objects in real-world scenarios. While open-set detection methods aim to address this by labeling such instances as *Unknown*, a single label is often insufficient. Rather than treating all unknowns as one class, assigning them more descriptive subcategories can improve decision-making in safety-critical contexts. For example, in autonomous driving, identifying an object as an *Unknown Animal* (requiring an urgent stop) versus *Unknown Debris* (requiring a safe lane change) is far more useful than a bare *Unknown*. To bridge this gap, we introduce TARO, a novel detection framework that not only identifies unknown objects but also classifies them into coarse parent categories within a semantic hierarchy. TARO combines a sparsemax-based head for modeling objectness, a hierarchy-guided relabeling component that provides auxiliary supervision, and a classification module that learns hierarchical relationships. Experiments show that TARO can categorize up to 29.9% of unknowns into meaningful coarse classes, significantly reduce confusion between unknown and known classes, and achieve competitive performance in both unknown recall and known mAP. Code is available at: https://anonymous.4open.science/r/TARO
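For readers unfamiliar with the sparsemax function named in the abstract, the following is a minimal sketch of the standard sparsemax projection in plain Python. It is not taken from the TARO code: the function name `sparsemax` and the example scores are illustrative assumptions, and the abstract does not specify how the objectness head applies the function. Unlike softmax, sparsemax can assign exactly zero probability to low-scoring entries.

```python
def sparsemax(scores):
    """Project a list of real-valued scores onto the probability simplex.

    Generic sparsemax; how TARO wires this into its objectness head is
    NOT described in the abstract, so this is illustrative only.
    """
    if not scores:
        raise ValueError("scores must be non-empty")

    sorted_scores = sorted(scores, reverse=True)

    # Find the support size k: the largest k such that
    # 1 + k * z_(k) > sum of the k largest scores.
    cumulative = 0.0
    support_sum = 0.0
    support_size = 0
    for k, value in enumerate(sorted_scores, start=1):
        cumulative += value
        if 1.0 + k * value > cumulative:
            support_size = k
            support_sum = cumulative

    # Threshold that makes the clipped scores sum to one.
    tau = (support_sum - 1.0) / support_size

    # Entries at or below the threshold receive exactly zero probability.
    return [max(s - tau, 0.0) for s in scores]


if __name__ == "__main__":
    # Hypothetical scores, chosen only to show the sparsity effect.
    print(sparsemax([1.0, 0.8, -1.0]))
```

In this sketch the third score falls below the threshold and is mapped to exactly zero, while the remaining entries still sum to one. That sparsity is the general property that distinguishes sparsemax from softmax.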
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 16712