Investigating the Link Between Representational Similarity and Model Interactions

19 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0
Keywords: Representational similarity, multi-agent system, cooperation, creativity
TL;DR: We examine whether representational similarity can predict the interactive behaviors of models.
Abstract: Researchers have shown that neural similarity among humans predicts social closeness and cooperative success, whereas innovation often emerges from interactions among dissimilar individuals. We investigate whether these principles extend to artificial intelligence by examining interactions between large language models. In our experiments, 276 model pairs interact across eight games that measure cooperation and novelty. We find that pairs with more similar representation spaces achieve significantly higher cooperation but exhibit reduced novelty and creativity. The effects of representational similarity on cooperation and novelty remain robust after controlling for other factors such as performance disparity and model size. We also find that similarity in the early layers consistently exhibits the strongest effect across games, compared with the middle and later layers. This suggests that a central factor underlying the observed trend is the extent to which the two models share lexical and semantic grounding. Together, these findings indicate that representational similarity can be an important consideration in multi-agent system design.
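The abstract does not say which similarity measure is used. As an illustration only, the sketch below computes linear centered kernel alignment (CKA), one common way to compare the representation spaces of two models on the same inputs. The function name `linear_cka`, the synthetic activations, and the choice of CKA itself are assumptions for this example, not the authors' method.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices (hypothetical helper).

    x: array of shape (n_inputs, d1), activations of model A at some layer.
    y: array of shape (n_inputs, d2), activations of model B on the same inputs.
    Returns a score in [0, 1]; 1 means the spaces match up to rotation and scale.
    """
    # Center each feature so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of each model's own covariance.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


if __name__ == "__main__":
    # Synthetic stand-ins for layer activations (assumption: real use would
    # extract hidden states from both models on a shared prompt set).
    rng = np.random.default_rng(0)
    acts_a = rng.normal(size=(50, 8))
    acts_b = rng.normal(size=(50, 5))
    print("self-similarity:", linear_cka(acts_a, acts_a))
    print("unrelated pair:", linear_cka(acts_a, acts_b))
```

Computing such a score separately for early, middle, and late layers would give the per-layer similarities that the abstract compares, under the assumption that both models are run on an identical set of inputs.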
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 19896