Reward-Adaptive Iterative Discovery: A Case Study on Automated Game Testing for NHL26

Published: 14 Jun 2026, Last Modified: 14 Jun 2026, Venue: RLVG Workshop 2026, License: CC BY 4.0
Keywords: Automated playtesting, Reinforcement Learning, Diversity
TL;DR: We introduce a method for discovering diverse exploits in game testing and demonstrate it on the game EA SPORTS NHL 26.
Abstract: Testing is a major effort in the games industry, consuming a significant share of development budget and personnel. We present a case study on a development version of the ice hockey game EA SPORTS NHL 26, in which human playtesters probe the goalie AI for behavioral exploits. To reduce the effort of re-testing the goalie AI after every game or behavior modification during development, we propose Reward-Adaptive Iterative Discovery (RAID), a novel method that finds exploits automatically by iteratively training a population of goal-scoring agents with Reinforcement Learning (RL). Previous approaches can already find exploits, but RL algorithms tend to overfit to a single solution. We introduce a simple extension on top of existing RL algorithms that lets them find multiple diverse, high-quality solutions. In our first deployment of this approach, a single experiment found six scoring exploit strategies that were qualitatively similar to those playtesters had found in hours-long manual testing sessions.
Submission Number: 8