Keywords: Competitive Games, Model-Based Reinforcement Learning, Core State Extraction, Variational Inference
Abstract: Recent reinforcement learning agents achieve excellent performance by building, from vast experience, an internal representation of the information crucial for predicting outcomes, but they do not clarify what essential information (the core) their representation extracts. A model-based reinforcement learning algorithm, Goal-Oriented Environment Inference (GOEI), has been proposed to address this issue, and its ability to explicitly learn such core states has been demonstrated in an abstract environment. Here, we validate GOEI in a more realistic environment, the competitive card game “Hol’s der Geier” (The Vulture Gets It). Surprisingly, it achieves a nearly optimal strategy, equivalent to the Nash equilibrium, using core states reduced to only 2.9% (452 states) of all possible observations (15,542). These results demonstrate that GOEI effectively excludes information irrelevant to game outcomes, thereby significantly reducing the memory burden.
Primary Area: reinforcement learning
Submission Number: 8890