Similarity-Distance-Magnitude Activations

TMLR Paper4906 Authors

21 May 2025 (modified: 17 Nov 2025), Rejected by TMLR, CC BY 4.0
Abstract: We introduce the Similarity-Distance-Magnitude (SDM) activation function, a more robust and interpretable formulation of the standard softmax activation function, adding Similarity (i.e., correctly predicted depth-matches into training) awareness and Distance-to-training-distribution awareness to the existing output Magnitude (i.e., decision-boundary) awareness, and enabling interpretability-by-exemplar via dense matching. We further introduce the SDM estimator, based on a data-driven partitioning of the class-wise empirical CDFs via the SDM activation, to control the class- and prediction-conditional accuracy among selective classifications. When used as the final-layer activation over pre-trained language models for selective classification, the SDM estimator is more robust to covariate shifts and out-of-distribution inputs than existing calibration methods using softmax activations, while remaining informative over in-distribution data.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: For reference, in the Appendix we have added a comparison to variational Bayesian last-layer neural networks. In this way, we compare to representative estimators from Bayesian, Frequentist, and empirically motivated perspectives.
Assigned Action Editor: ~Jasper_Snoek1
Submission Number: 4906