Learning Environment Models with Continuous Stochastic Dynamics - with an Application to Deep RL Testing
Abstract: Techniques like deep reinforcement learning (DRL) enable autonomous agents to solve tasks in complex environments automatically through learning. Despite their potential, neural-network-based decision-making policies are hard to understand and test. To ease the adoption of such techniques, we learn automata models of environmental behavior under the control of an agent. These models provide insights into the decisions faced by agents and a basis for testing. To scale automata learning to environments with complex and continuous dynamics, we compute an abstract state-space representation through dimensionality reduction and clustering of observed environmental states. The stochastic transitions are learned via passive automata learning from agent-environment interactions. Furthermore, we iteratively sample additional trajectories to enhance the learned model's accuracy. We demonstrate the potential of our automata learning framework by (1) solving popular RL benchmark problems and (2) applying it for differential testing of DRL agents. Our results show that the learned models are sufficiently precise to compute policies that solve the respective control tasks. Yet the models are sufficiently general for coverage-guided testing, where we reveal significant differences in the functional failure frequency between pairs of DRL agents.
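To make the abstraction step concrete, the following is a minimal sketch of how an abstract state space might be computed from logged environment states via dimensionality reduction and clustering, followed by a frequency-based estimate of transition probabilities. It assumes scikit-learn and NumPy; the data, parameter values, and the action-agnostic transition estimate are illustrative assumptions, not the paper's actual pipeline (which learns stochastic automata conditioned on agent actions).

```python
# Sketch: abstract-state construction via PCA + k-means, then an empirical
# transition-probability estimate over the resulting abstract states.
# All sizes and hyperparameters below are assumed for illustration.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
observations = rng.normal(size=(5000, 8))  # placeholder for logged env states

# Dimensionality reduction of the continuous observation space.
pca = PCA(n_components=2)
reduced = pca.fit_transform(observations)

# Clustering: each concrete state maps to a discrete abstract state (label).
n_clusters = 16
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
abstract_states = kmeans.fit_predict(reduced)

# Empirical transition frequencies between consecutive abstract states.
# NOTE: this ignores the agent's actions for brevity; passive automata
# learning (as in the paper) would condition transitions on actions.
trans = np.zeros((n_clusters, n_clusters))
for s, t in zip(abstract_states[:-1], abstract_states[1:]):
    trans[s, t] += 1
trans /= np.maximum(trans.sum(axis=1, keepdims=True), 1)  # row-normalize
```

A trajectory of concrete states can then be mapped to a sequence of abstract states, which serves as input for passive learning of the stochastic transition structure.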