FindMeIfYouCan: Bringing Open Set Metrics to $\textit{near}$, $\textit{far}$ and $\textit{farther}$ Out-of-Distribution Object Detection

ICLR 2026 Conference Submission 16840 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Out-Of-Distribution Detection, Open Set Detection, Object Detection, Benchmark, Computer vision
TL;DR: A rigorous benchmark for OOD object detection, introducing open-set metrics and semantically stratified splits, to precisely evaluate unknown object identification. Our work unmasks safety-critical insights that current metrics fail to reveal.
Abstract: Recently, out-of-distribution (OOD) detection has gained traction as a key research area in object detection (OD), aiming to identify incorrect predictions often linked to unknown objects. In this paper, we reveal critical flaws in the current OOD-OD evaluation protocol: it fails to account for scenarios where unknown objects are overlooked, since the current metrics (AUROC and FPR) do not evaluate the ability to find unknown objects. Moreover, the current benchmark violates the assumption that OOD objects do not overlap with in-distribution (ID) classes. These problems call into question the validity and relevance of previous evaluations. To address these shortcomings, we first manually curate and enhance the existing benchmark with new evaluation splits---semantically $\textit{near}$, $\textit{far}$, and $\textit{farther}$ relative to ID classes. Then, we integrate established metrics from the open-set object detection (OSOD) community, which, for the first time, offer deeper insights into how well OOD-OD methods detect unknown objects, when they overlook them, and when they misclassify OOD objects as ID---key situations for reliable real-world deployment of object detectors. Our comprehensive evaluation across several OD architectures and OOD-OD methods shows that the current metrics do not necessarily reflect the actual localization of unknown objects, for which OSOD metrics are necessary. Furthermore, we observe that semantically and visually similar OOD objects are easier to localize but more likely to be confused with ID objects, whereas $\textit{far}$ and $\textit{farther}$ objects are harder to localize but less prone to misclassification.
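The core evaluation gap named in the abstract can be illustrated with a toy sketch (not the authors' code; all data and the helper `fpr_at_95_tpr` are hypothetical): AUROC and FPR95 are computed only over boxes the detector actually produces, so they can look excellent even when most unknown objects are never localized, which an OSOD-style unknown recall exposes.

```python
# Toy, hypothetical sketch: score-based AUROC/FPR95 on *detected* boxes vs.
# an OSOD-style unknown recall over ground-truth unknown objects.
import numpy as np
from sklearn.metrics import roc_auc_score

def fpr_at_95_tpr(id_scores, ood_scores):
    """FPR on ID detections at the threshold that flags 95% of detected OOD boxes.
    Convention assumed here: higher score = more likely OOD."""
    thresh = np.percentile(ood_scores, 5)       # keep 95% of detected OOD above threshold
    return float(np.mean(id_scores >= thresh))  # fraction of ID boxes wrongly flagged as OOD

# Toy scenario: the detector localizes only 10 of 100 unknown objects,
# but scores those 10 detections very confidently as OOD.
rng = np.random.default_rng(0)
id_scores  = rng.normal(0.2, 0.05, size=500)    # OOD scores of ID detections
ood_scores = rng.normal(0.9, 0.02, size=10)     # OOD scores of the few detected unknowns

labels = np.concatenate([np.zeros_like(id_scores), np.ones_like(ood_scores)])
scores = np.concatenate([id_scores, ood_scores])

auroc = roc_auc_score(labels, scores)           # near-perfect separation
fpr95 = fpr_at_95_tpr(id_scores, ood_scores)    # near zero
unknown_recall = 10 / 100                       # 90% of unknown objects were never found
print(f"AUROC={auroc:.3f}  FPR95={fpr95:.3f}  unknown recall={unknown_recall:.2f}")
```

In this constructed case the score-based metrics report near-ideal values while unknown recall reveals that most unknown objects were missed, which is the kind of safety-critical blind spot the proposed benchmark and OSOD metrics are meant to surface.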
Supplementary Material: zip
Primary Area: datasets and benchmarks
Submission Number: 16840