The Probability Simplex is Compatible

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Deep Learning, Representation Learning, Compatible Learning, Neural Collapse
TL;DR: In this paper we show that using softmax outputs and logits as features leads to backward-compatible representations between independently trained models, without requiring any additional losses or network modifications.
Abstract: In retrieval systems, updating the base model requires re-extracting feature vectors for all gallery data, because the internal feature representations change. This process can be computationally expensive and time-consuming, especially for large-scale gallery sets. To address this issue, backward-compatible learning was introduced, allowing direct comparison between the representations of the old model and those obtained by the newly trained model. Existing backward-compatible methods introduce additional losses or specific network architecture changes, which require access to the base model and thereby limit compatibility with independently trained models. In this paper, we show that any independently trained model can be made compatible with any other simply by using features derived from softmax outputs. We leverage the geometric properties of the softmax function, which projects vectors onto the Probability Simplex, preserving the alignment of softmax vectors across model updates and satisfying the definition of compatibility. A similar property holds when logits are used as the feature representation: during training they also converge to a simplex configuration, but with a wider spread in the feature distribution than softmax outputs, leading to a more robust and transferable representation. Our framework achieves state-of-the-art performance on standard benchmarks, both when the number of training classes grows across multiple steps and when the base model is updated with a more advanced network architecture. This demonstrates that any publicly available pretrained model can be made compatible without requiring any additional training or adaptation. Our code will be made available upon acceptance.
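A minimal sketch of the core idea as we read it from the abstract: index the gallery once with the old model's softmax outputs (or logits) over a shared class space, then compare new-model queries against it directly. The specific ResNet backbones, the placeholder tensors, the `use_logits` flag, and the cosine-similarity retrieval below are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch: cross-model retrieval using softmax outputs (or logits) as features.
# Assumes both independently trained models share the same class space
# (here, the 1000 ImageNet classes); backbones and data are placeholders.
import torch
import torch.nn.functional as F
import torchvision.models as models

# Two independently trained, publicly available pretrained models.
old_model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1).eval()
new_model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()

@torch.no_grad()
def extract_features(model, images, use_logits=False):
    """Map images to the Probability Simplex (softmax) or keep raw logits."""
    logits = model(images)  # shape: (N, num_classes)
    return logits if use_logits else F.softmax(logits, dim=-1)

# Gallery indexed ONCE with the old model; queries come from the new model.
gallery_imgs = torch.randn(8, 3, 224, 224)  # placeholder gallery images
query_imgs = torch.randn(2, 3, 224, 224)    # placeholder query images

gallery = extract_features(old_model, gallery_imgs)  # old representation
queries = extract_features(new_model, query_imgs)    # new representation

# Direct old-vs-new comparison via cosine similarity in the simplex space;
# no re-extraction of the gallery and no compatibility loss is needed.
sim = F.normalize(queries, dim=-1) @ F.normalize(gallery, dim=-1).T
nearest = sim.argmax(dim=-1)  # index of the closest gallery item per query
print(nearest)
```

Because both models output distributions over the same classes, the softmax (or logit) vectors of the old and new model live in the same space, which is what allows the gallery to be reused across the model update.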
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12123