Keywords: out-of-distribution detection, long-tail learning
TL;DR: This paper presents a simple OOD detection approach that mixes the multi-layer representations of Transformers.
Abstract: Recognizing out-of-distribution (OOD) samples is essential for deploying robust machine learning systems in open-world environments. While conventional OOD detection approaches rely on feature representations from the penultimate layer of neural networks, they often overlook informative signals embedded in intermediate layers. In this paper, we present a straightforward feature mixing approach for pre-trained Transformers, which combines multi-layer representations via computed importance weights and identifies OOD samples using the Mahalanobis distance in the blended feature space. When in-distribution samples are accessible, we show that parameter-efficient fine-tuning strategies effectively balance classification accuracy and OOD detection performance. We conduct extensive empirical analyses to validate the superiority of our proposed method under both zero-shot and fine-tuning settings, using both class-balanced and long-tailed datasets. The source code is available at https://github.com/SEUML/X-Maha.
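To make the core idea concrete, the following is a minimal sketch of the general recipe the abstract describes: blend per-layer features with importance weights, then score samples by Mahalanobis distance to the in-distribution statistics. The uniform weights `w`, the synthetic features, and the single-Gaussian fit are placeholder assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

# Hypothetical setup: per-layer representations of N in-distribution (ID) samples.
# feats_id is a list of L arrays, each of shape (N, D).
rng = np.random.default_rng(0)
L, N, D = 4, 500, 32
feats_id = [rng.normal(size=(N, D)) for _ in range(L)]
w = np.ones(L) / L  # placeholder: uniform layer-importance weights

def mix_layers(feats, w):
    """Blend multi-layer features into one representation via a weighted sum."""
    return sum(wi * f for wi, f in zip(w, feats))  # (N, D)

z_id = mix_layers(feats_id, w)

# Fit a Gaussian to the mixed ID features (class-conditional means would be
# used when labels are available); regularize the covariance for stability.
mu = z_id.mean(axis=0)
cov = np.cov(z_id, rowvar=False) + 1e-6 * np.eye(D)
prec = np.linalg.inv(cov)

def mahalanobis_score(feats_test, w, mu, prec):
    """Higher score = farther from the ID statistics = more likely OOD."""
    d = mix_layers(feats_test, w) - mu
    return np.einsum("nd,de,ne->n", d, prec, d)

feats_test = [rng.normal(loc=0.5, size=(10, D)) for _ in range(L)]
print(mahalanobis_score(feats_test, w, mu, prec))
```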
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 11625