Mahalanobis++: Improving OOD Detection via Feature Normalization

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We show that the Mahalanobis distance estimation is degraded by strong variations in the feature norm and provide a simple fix (projection to the unit sphere) that consistently improves the method and leads to new SOTA results.
Abstract: Detecting out-of-distribution (OOD) examples is an important task for deploying reliable machine learning models in safety-critical applications. While post-hoc methods based on the Mahalanobis distance applied to pre-logit features are among the most effective for ImageNet-scale OOD detection, their performance varies significantly across models. We connect this inconsistency to strong variations in feature norms, indicating severe violations of the Gaussian assumption underlying the Mahalanobis distance estimation. We show that simple $\ell_2$-normalization of the features mitigates this problem effectively, aligning better with the premise of normally distributed data with a shared covariance matrix. Extensive experiments on 44 models across diverse architectures and pretraining schemes show that $\ell_2$-normalization improves conventional Mahalanobis distance-based approaches significantly and consistently, and outperforms other recently proposed OOD detection methods.
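To illustrate the idea, the sketch below scores test samples by their Mahalanobis distance to class means computed from $\ell_2$-normalized pre-logit features with a shared covariance matrix. It is a minimal illustration of the approach described in the abstract, not the authors' implementation (see the linked repository for that); function names and estimator details such as the pseudo-inverse are illustrative assumptions.

```python
import numpy as np

def fit_mahalanobis(train_feats, train_labels, eps=1e-12):
    """Fit class means and a shared covariance on l2-normalized features.

    Sketch only: the exact estimator (e.g. covariance regularization)
    may differ from the authors' code.
    """
    # Project features onto the unit sphere (l2-normalization).
    feats = train_feats / (np.linalg.norm(train_feats, axis=1, keepdims=True) + eps)
    classes = np.unique(train_labels)
    means = np.stack([feats[train_labels == c].mean(axis=0) for c in classes])
    # Shared covariance: pool features centered at their class mean.
    centered = feats - means[np.searchsorted(classes, train_labels)]
    cov = centered.T @ centered / len(feats)
    precision = np.linalg.pinv(cov)
    return means, precision

def mahalanobis_score(test_feats, means, precision, eps=1e-12):
    """Return the negative minimum squared Mahalanobis distance to any class mean.

    Higher scores indicate in-distribution samples; low scores flag OOD inputs.
    """
    feats = test_feats / (np.linalg.norm(test_feats, axis=1, keepdims=True) + eps)
    diffs = feats[:, None, :] - means[None, :, :]              # shape (N, C, D)
    d2 = np.einsum('ncd,de,nce->nc', diffs, precision, diffs)  # squared distances
    return -d2.min(axis=1)
```

In this sketch, the only change relative to a standard Mahalanobis OOD detector is the normalization of the pre-logit features before estimating the means and covariance and before scoring, i.e. the "projection to the unit sphere" mentioned in the TL;DR.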
Lay Summary: In critical applications like healthcare or autonomous driving, it is important that AI systems can tell when they are seeing something unfamiliar, what researchers call "out-of-distribution" (OOD) data. If a model cannot do this, it might make confident but dangerously wrong predictions. One popular way to detect unfamiliar images uses a mathematical technique called the Mahalanobis distance. In practice, this method works very well for some AI models but surprisingly poorly for others. We investigated why and found that the inconsistency comes from the way these models represent images internally, especially how large or small their feature values are. Our solution is simple: we adjust the internal image features so that they all have the same size, a technique called $\ell_2$-normalization. We show that this normalization makes the models' behavior more predictable and greatly improves their ability to spot unfamiliar data. We tested this across 44 models and found that our method consistently outperforms existing approaches. The code is freely available online.
Link To Code: https://github.com/mueller-mp/maha-norm/
Primary Area: Deep Learning->Robustness
Keywords: OOD detection, Mahalanobis distance, out-of-distribution detection
Submission Number: 12770