Keywords: algorithm evaluation, cognitive diagnosis, reinforcement learning, game theory
Abstract: Driven by advances in deep learning, intelligent game-theoretic strategies have progressed rapidly. As these methods enter pivotal real-world domains, rigorous and impartial evaluation has emerged as a central concern. Although many benchmarks now ensure the comparability of diverse game-playing algorithms, prevailing evaluation methods focus on macro-level win–loss outcomes and aggregate metrics, offering little insight into how strategies differ across dimensions of game capability. Moreover, existing methods are confined to evaluation within a single environment and struggle with evaluation tasks under environment shift. In this work, we propose MGCE (Multi-dimensional Game-theoretic Capability Evaluation), a novel evaluation framework that first partitions game scenarios and embeds them using large-model-guided multi-modal features. It then leverages neural networks to model strategy–environment interactions, quantifying the multi-dimensional capabilities of strategies and the characteristics of each environment. MGCE further incorporates a fine-tuning module to predict the performance of game strategies in shifted environments. Experimental results demonstrate that the multi-dimensional game capabilities learned by MGCE enable interpretable analysis of strategies, and that MGCE outperforms competing baselines in performance prediction across both private and public environments.
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 13165