Keywords: representation learning, representation similarity, RSA, CKA, SVCCA, z-scoring, standardization, evaluation protocol, transfer learning, text encoders, vision encoders
Abstract: Similarity analyses of learned representations often produce different rankings across popular measures, which complicates comparison and reuse. We test whether a minimal, fully specified preprocessing step can reconcile these outcomes. Using three vision encoders and six text encoders on public datasets, we evaluate representation similarity analysis (RSA), linear centered kernel alignment (CKA), and singular vector canonical correlation analysis (SVCCA) on raw features and on features z-scored per feature across stimuli. On text encoders, standardization raises cross-measure ranking agreement: Kendall's tau between RSA and CKA increases from 0.64 to 0.89 for CLS-pooled vectors, and between RSA and SVCCA from 0.58 to 0.83. Mean-pooled vectors already agree strongly and show smaller gains. In vision, heatmaps reveal that RSA is sensitive to standardization while the other measures remain stable. A linear transfer probe on text shows that higher similarity is associated with lower prediction error. An orthogonal-transform control leaves CKA unchanged, consistent with theory. These results support a simple reporting standard: state and apply dataset-wise centering and variance scaling when comparing representations, since this improves agreement across measures and clarifies links to transfer.
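The preprocessing and the orthogonal-transform control described above are simple to make concrete. Below is a minimal sketch, not the authors' code: it assumes features are arrays of shape (n_stimuli, n_features), uses the standard linear CKA formula, and the names `zscore_features` and `linear_cka` are illustrative.

```python
import numpy as np

def zscore_features(X, eps=1e-8):
    """Per-feature z-scoring across stimuli: center and variance-scale each column."""
    mu = X.mean(axis=0, keepdims=True)
    sd = X.std(axis=0, keepdims=True)
    return (X - mu) / (sd + eps)

def linear_cka(X, Y):
    """Linear CKA between two representations of the same stimuli."""
    X = X - X.mean(axis=0, keepdims=True)  # column-center each representation
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

# Toy stand-ins for two encoders' features over the same 100 stimuli.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))
Y = rng.normal(size=(100, 32))

# Orthogonal-transform control: rotating one feature space by a random
# orthogonal matrix Q leaves linear CKA unchanged, consistent with theory.
Q, _ = np.linalg.qr(rng.normal(size=(64, 64)))
assert np.isclose(linear_cka(X, Y), linear_cka(X @ Q, Y))

print(linear_cka(zscore_features(X), zscore_features(Y)))
```

The invariance follows because the Frobenius norm is unchanged under right-multiplication by an orthogonal matrix, so both the numerator and the denominator of linear CKA are preserved.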
Submission Number: 134