PIE: Permutation-Invariant Multi-Entity Evaluation

Published: 2025, Last Modified: 14 Jan 2026, Venue: CoG 2025, License: CC BY-SA 4.0
Abstract: Evaluation systems are used in online competitive games and sports to assess players' skills. Popular methods such as Elo, designed for two-player competitive games such as chess and tennis, are based on the Bradley-Terry model and update players' ratings from competition outcomes. Extensions such as TrueSkill and mElo were proposed for multi-player (team) games and two-player intransitive games, respectively. However, each existing evaluation method is limited to specific settings; for example, mElo handles only two-player games, and TrueSkill and Elo cannot handle intransitive games. In addition, previous team evaluation methods assume that individuals' contributions to team performance are uniformly determined by individual abilities, which does not hold in many situations, such as football, where players have different roles, and games in which the team score is defined as the maximum or minimum of the players' gains. In this paper, we address the challenge of evaluating player skill in multi-player (team) competitions. We propose PIE, an online permutation-invariant evaluation model for multi-entity competitions that ensures the predicted winner of a multi-entity match remains invariant to the order of the input entities. For multi-team evaluation, PIE ensures that team ratings increase monotonically with improvements in individual player ratings. Empirical results on predicting the winner and winning probabilities in real-world games demonstrate that PIE achieves performance comparable to other baselines in predicting multi-player (team) matches.
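To make the background concrete, the following is a minimal sketch of the standard two-player Elo update derived from the Bradley-Terry model, together with one simple way a multi-entity winner prediction can be made invariant to input order. It is not the PIE model, whose details the abstract does not give. The function names, the K-factor of 32, the scale of 400, and the normalised-strength generalisation to several entities are all assumptions chosen for illustration.

```python
def elo_expected(r_a, r_b, scale=400.0):
    """Bradley-Terry style probability that A beats B (standard Elo form)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def elo_update(r_a, r_b, score_a, k=32.0):
    """Return updated (r_a, r_b) after one match.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k=32 is an assumed K-factor, not a value from the paper.
    """
    delta = k * (score_a - elo_expected(r_a, r_b))
    return r_a + delta, r_b - delta


def win_probabilities(ratings, scale=400.0):
    """Illustrative multi-entity generalisation (an assumption, not PIE).

    Each entity's strength is 10 ** (rating / scale); probabilities are the
    normalised strengths. Reordering the input only reorders the output.
    """
    strengths = [10 ** (r / scale) for r in ratings]
    total = sum(strengths)
    return [s / total for s in strengths]


def predicted_winner(names, ratings):
    """Name of the entity with the highest win probability."""
    probs = win_probabilities(ratings)
    best = max(range(len(names)), key=lambda i: probs[i])
    return names[best]


if __name__ == "__main__":
    print(elo_update(1500.0, 1500.0, 1.0))
    print(predicted_winner(["a", "b", "c"], [1500.0, 1620.0, 1480.0]))
    print(predicted_winner(["c", "a", "b"], [1480.0, 1500.0, 1620.0]))
```

With two entities, the normalised-strength form reduces to the usual Elo expected score, and the predicted winner depends only on which entity holds which rating, not on the order in which the entities are listed.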