Empirical Spatial Error Bounds for Reliable Semantic Segmentation of Pedestrians and Riders

Timo Bartels, Malte Stelzer, Jan Bickerdt, Volker Schomerus, Jan Piewek, Thorsten Bagdonat, Tim Fingscheidt

Published: 2025, Last Modified: 26 May 2026, Venue: IV 2025, License: CC BY-SA 4.0

Abstract: The mean intersection over union (mIoU) is a standard metric for evaluating semantic segmentation models. While steady improvements in mIoU have been achieved on automotive benchmarks like Cityscapes, their impact on reliably detecting vulnerable road users, such as pedestrians and riders, remains unclear. This study empirically analyzes 167 semantic segmentation models w.r.t. the spatial distribution of the false positive rate and false negative rate in the Cityscapes dataset. Our analysis reveals that many segmentation errors occur at object contours, which hardly influence driving decisions and road user safety. Accordingly, we propose to exclude such irrelevant errors. We define spatial error bounds within which models reliably detect pedestrians and riders. Since time-to-collision is strongly related to distance, and the vertical pixel position is roughly related to distance, the vertical position of segmentation errors provides an effective way to evaluate the reliability of semantic segmentation models on an entire dataset. Our evaluation of such empirical spatial error bounds reveals that strong models (w.r.t. mIoU) are related to an improved detection of existing pedestrians (false negative rate, FNR). On the other hand, mIoU in general is only weakly related to hallucinations of pedestrians and riders (false positive rate, FPR). Some models even exhibit a higher FPR despite having an 11.2% absolute higher mIoU.
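
The idea of tying segmentation errors to the vertical pixel position can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rowwise_error_rates`, the toy label maps, and the class ID `1` are assumptions made for illustration, and the sketch works on a single image, whereas the paper evaluates an entire dataset and excludes contour errors. For one class, it computes the false negative rate and false positive rate separately for each image row.

```python
import numpy as np

def rowwise_error_rates(pred, gt, class_id):
    """Per-image-row FNR and FPR for one class (illustrative sketch).

    pred, gt: integer label maps of shape (H, W).
    Returns (fnr, fpr), each of shape (H,); NaN where a rate is undefined.
    """
    pred_pos = pred == class_id
    gt_pos = gt == class_id

    # Count confusion-matrix entries along each row (axis=1 sums over width).
    tp = np.sum(pred_pos & gt_pos, axis=1).astype(float)
    fn = np.sum(~pred_pos & gt_pos, axis=1).astype(float)
    fp = np.sum(pred_pos & ~gt_pos, axis=1).astype(float)
    tn = np.sum(~pred_pos & ~gt_pos, axis=1).astype(float)

    pos = tp + fn  # ground-truth pixels of the class in each row
    neg = fp + tn  # all other ground-truth pixels in each row

    # FNR = FN / (TP + FN), FPR = FP / (FP + TN); rows without support stay NaN.
    fnr = np.divide(fn, pos, out=np.full_like(pos, np.nan), where=pos > 0)
    fpr = np.divide(fp, neg, out=np.full_like(neg, np.nan), where=neg > 0)
    return fnr, fpr

# Toy 3x4 example with a hypothetical class ID of 1.
gt = np.array([[0, 0, 0, 0],
               [0, 1, 1, 0],
               [1, 1, 1, 1]])
pred = np.array([[0, 1, 0, 0],
                 [0, 1, 0, 0],
                 [1, 1, 1, 1]])
fnr, fpr = rowwise_error_rates(pred, gt, class_id=1)
```

In this toy case the top row contains a hallucinated pixel (FPR 0.25, FNR undefined), the middle row misses half of the object (FNR 0.5), and the bottom row is error-free. Accumulating the per-row counts over all images before dividing would give a dataset-level profile over vertical position.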

External IDs: dblp:conf/ivs/BartelsSBSPBF25