Keywords: Representation Learning, Hypersphere, Maximum A Posteriori.
Abstract: A common practice when training Deep Neural Networks is to force the learned representations to lie on the standard unit hypersphere, with respect to the $L_2$ norm. This practice has been shown to improve both the stability and final performance of DNNs in many applications. In this paper, we derive a unified theoretical framework for learning representations on arbitrary $L_p$ hyperspheres for classification tasks, based on Maximum A Posteriori (MAP) modeling. Specifically, we derive an expression for the probability distribution of multivariate Gaussians projected onto any $L_p$ hypersphere and derive the general associated loss function.
Additionally, we show that this framework establishes the theoretical equivalence of all projections on $L_p$ hyperspheres through the MAP modeling. It also provides a new interpretation of the traditional Softmax Cross Entropy with temperature (SCE-$\tau$) loss function. Experiments on standard computer vision datasets provide empirical validation of the equivalence of projections on $L_p$ unit hyperspheres when adequate objectives are used. They also show that SCE-$\tau$ applied to projected representations, with an optimally chosen temperature, achieves comparable performance.
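To make the setup the abstract describes concrete, here is a minimal sketch of training with representations projected onto the unit $L_p$ hypersphere, combined with temperature-scaled softmax cross-entropy. This is an illustration under stated assumptions, not the paper's method: the function names, the default temperature, and the choice to also project the classifier weights are all assumptions made for the sketch.

```python
import torch
import torch.nn.functional as F

def project_to_lp_sphere(z: torch.Tensor, p: float = 2.0,
                         eps: float = 1e-12) -> torch.Tensor:
    """Project each row of z onto the unit L_p hypersphere: z / ||z||_p.

    For p = 2 this reduces to the standard L2 normalization commonly
    applied to deep representations.
    """
    norm = z.abs().pow(p).sum(dim=-1, keepdim=True).pow(1.0 / p)
    return z / norm.clamp_min(eps)

def sce_tau_loss(z: torch.Tensor, class_weights: torch.Tensor,
                 labels: torch.Tensor, tau: float = 0.1,
                 p: float = 2.0) -> torch.Tensor:
    """Softmax cross-entropy with temperature (SCE-tau) on L_p-projected
    representations.

    Projecting `class_weights` as well is an assumption of this sketch;
    with p = 2 the logits become cosine similarities scaled by 1 / tau.
    """
    z_proj = project_to_lp_sphere(z, p)
    w_proj = project_to_lp_sphere(class_weights, p)
    logits = (z_proj @ w_proj.t()) / tau  # similarities scaled by temperature
    return F.cross_entropy(logits, labels)
```

In this sketch, varying `p` changes only the projection while the loss keeps the same form, which mirrors the abstract's claim that projections on different $L_p$ hyperspheres are equivalent under adequate objectives.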
Submission Number: 35