Keywords: Out-of-Distribution Object Detection, Input Distance Awareness
TL;DR: We uncover which transformer layers best reveal distance and harness them to create simple, state-of-the-art detectors for out-of-distribution objects.
Abstract: Out-of-distribution object detection (OOD-OD) is essential for building robust vision systems in safety-critical applications. While transformer-based architectures have become dominant in object detection, existing work on OOD-OD has primarily focused on OOD object synthesis or OOD detection scores, with limited understanding of the internal feature representations of transformers. In this work, we present the first in-depth analysis of transformer features for OOD-OD. Motivated by theoretical insights that input distance awareness – the ability of feature representations to reflect the distance from the training distribution – is a key property for predictive uncertainty estimation and reliable OOD detection, we systematically evaluate this property across transformer layers. Our analysis reveals that certain transformer layers exhibit heightened input distance awareness. Leveraging this observation, we develop simple yet effective OOD detection methods based on features from these layers, achieving state-of-the-art performance across multiple OOD-OD benchmarks. Our findings provide new insights into the role of transformer representations in OOD detection. Code and additional experiments are provided in the supplementary material.
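To make the core idea concrete, the following is a minimal sketch of a feature-distance OOD score computed from one transformer layer. The choice of a class-agnostic Gaussian fit and a Mahalanobis score is an illustrative assumption, not the paper's exact recipe; `fit_gaussian` and `ood_score` are hypothetical helper names.

```python
# Hypothetical sketch: scoring OOD objects by how far a test feature at one
# distance-aware transformer layer lies from the in-distribution feature fit.
# The Mahalanobis formulation here is an assumed stand-in, not the authors' method.
import torch

def fit_gaussian(train_feats: torch.Tensor):
    """Fit a class-agnostic Gaussian to in-distribution features of shape [N, D]."""
    mu = train_feats.mean(dim=0)                      # mean feature, [D]
    centered = train_feats - mu
    cov = centered.T @ centered / (len(train_feats) - 1)
    # Regularize the covariance for numerical stability before inverting.
    precision = torch.linalg.inv(cov + 1e-4 * torch.eye(cov.shape[0]))
    return mu, precision

def ood_score(feat: torch.Tensor, mu: torch.Tensor, precision: torch.Tensor):
    """Mahalanobis distance of a test feature [D] from the in-distribution fit;
    a larger distance suggests the input is more likely out-of-distribution."""
    d = feat - mu
    return d @ precision @ d
```

In this sketch, the fit is computed once over features extracted at the identified distance-aware layer for the training set; at test time, each detected object's feature is scored and a threshold on the distance flags it as OOD.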
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 9211