Keywords: Intrinsic dimension, universality of representations
TL;DR: The intrinsic dimension of latent representations of data is class-dependent, and this dependency is universal across models.
Abstract: While state-of-the-art transformer networks use several hundred latent variables per layer, it has been shown that these features can actually be represented by relatively low-dimensional manifolds. The intrinsic dimension is a geometrical property of the manifold that latent representations populate, viz., the minimal number of parameters needed to describe the representations.
In this work, we compare the intrinsic dimensions of three image transformer networks for classes of the CIFAR-10 and CIFAR-100 datasets. We find compelling evidence that the intrinsic dimensions differ among classes but are universal across networks. This universality persists across different pretraining strategies, fine-tuning, and different model sizes. Our results strengthen the hypothesis that different models learn similar representations of data and suggest that further investigation of intrinsic dimension could yield deeper insights into the universality of latent representations.
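For concreteness, the sketch below shows one common way to estimate the intrinsic dimension defined above: the TwoNN estimator (Facco et al., 2017), which infers the dimension from the ratio of each point's second- to first-nearest-neighbor distance. The abstract does not state which estimator the authors use, so this choice, the function name `twonn_id`, and the synthetic example are illustrative assumptions only.

```python
# Minimal sketch of the TwoNN intrinsic-dimension estimator (Facco et al., 2017).
# Assumption: the submission does not specify its estimator; TwoNN is one common choice.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def twonn_id(X: np.ndarray) -> float:
    """Estimate the intrinsic dimension of points X with shape (n_samples, n_features)."""
    # Distances to the two nearest neighbors; column 0 is each point itself (distance 0).
    dists, _ = NearestNeighbors(n_neighbors=3).fit(X).kneighbors(X)
    r1, r2 = dists[:, 1], dists[:, 2]
    # Drop points with duplicate neighbors (r1 == 0) to avoid division by zero.
    mask = r1 > 0
    mu = r2[mask] / r1[mask]
    # Under the TwoNN model, mu follows a Pareto law with exponent d;
    # the maximum-likelihood estimate is d = n / sum(log mu_i).
    return mask.sum() / np.log(mu).sum()

# Example: 1000 points on a 2-D linear subspace embedded in 50-D ambient space.
rng = np.random.default_rng(0)
Z = rng.normal(size=(1000, 2))
X = Z @ rng.normal(size=(2, 50))
print(f"estimated intrinsic dimension: {twonn_id(X):.2f}")  # close to 2
```

In this example the ambient dimension is 50 but the data occupy a 2-D subspace, so the estimator returns a value near 2, mirroring the abstract's point that high-dimensional features can live on low-dimensional manifolds.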
Track: Extended Abstract Track
Submission Number: 48