Learning Semantic Proto-concepts for Open-world Object Detection
Abstract: Detecting out-of-distribution (OOD) objects is pivotal for robust visual perception in real-world environments. Contemporary methods rely on feature-space regularization using synthesized unknowns, and therefore fail to model the distributions of authentic unseen data. We introduce ODE (Object Detection Enhancer), a framework that enables detectors to recognize novel objects by learning holistic scene representations. Our approach combines two components. First, global semantic extraction abstracts entire images into condensed spatial-concept prototypes through contrastive scene factorization, dynamically encoding relationships between known and unknown entities without supervision. Second, adaptive conceptual binding fuses these prototypes with detector features via attention-based alignment, suppressing model overconfidence on OOD instances through category-aware attenuation. At deployment, we devise a contextual anomaly metric that quantifies deviation from learned in-distribution (ID) concepts and use it to refine object confidence. Evaluations on the MS-COCO, PASCAL-VOC, and BDD100K benchmarks show that ODE reduces FPR95 by 13.6% and improves AUROC by 3.8% over state-of-the-art alternatives.
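For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how FPR95 and AUROC are commonly computed from per-object scores. It is not the authors' evaluation code. The function names `auroc` and `fpr_at_95_tpr` are hypothetical, and the sketch assumes that a higher score means "more in-distribution"; if the score is an anomaly score (higher means more OOD), negate it first.

```python
import numpy as np


def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample.

    Assumption: higher score = more in-distribution. Ties count as one half.
    Uses an O(n*m) pairwise comparison, which is fine for small illustrations.
    """
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    diff = id_scores[:, None] - ood_scores[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID when about 95% of ID samples are accepted.

    The threshold is the 5th percentile of the ID scores (linear interpolation),
    so roughly 95% of ID scores lie at or above it.
    """
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    threshold = np.percentile(id_scores, 5)
    return float((ood_scores >= threshold).mean())


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    id_scores = [0.9, 0.8, 0.7, 0.6]
    ood_scores = [0.1, 0.2, 0.3, 0.65]
    print("AUROC:", auroc(id_scores, ood_scores))
    print("FPR95:", fpr_at_95_tpr(id_scores, ood_scores))
```

Lower FPR95 and higher AUROC both indicate better separation between ID and OOD objects, which is the direction of the improvements reported above.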