Spatio-Temporal Abstractions in Reinforcement Learning Through Neural Encoding

Submitted to ICLR 2017
Abstract: Recent progress in the field of Reinforcement Learning (RL) has made it possible to tackle bigger and more challenging tasks. However, the increasing complexity of these problems, as well as the use of more sophisticated models such as Deep Neural Networks (DNNs), impedes our understanding of artificial agents' behavior. In this work, we present the Semi-Aggregated Markov Decision Process (SAMDP) model. The purpose of SAMDP modeling is to describe and allow a better understanding of complex behaviors by identifying temporal and spatial abstractions. In contrast to other modeling approaches, the SAMDP is built in a transformed state space that encodes the dynamics of the problem. We show that working with the \emph{right} state representation mitigates the problem of finding spatial and temporal abstractions. We describe the process of building the SAMDP model from observed trajectories and give examples of its use on a toy problem and on complicated DQN policies. Finally, we show how the SAMDP can be used to monitor the policy at hand and make it more robust.
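To make the construction concrete, here is a minimal sketch of building an SAMDP-style model from observed trajectories. It assumes per-step state embeddings are already available (e.g., activations from a trained DQN's last hidden layer) and uses k-means as a stand-in for the paper's clustering step; the function and parameter names (build_samdp, n_clusters) are illustrative, not taken from the paper.

```python
# Sketch: aggregate observed trajectories into abstract states (clusters)
# and skill-level transitions between them. Assumptions are noted inline.
import numpy as np
from sklearn.cluster import KMeans

def build_samdp(trajectories, n_clusters=10, seed=0):
    """Build a simple SAMDP-like model from trajectories.

    trajectories: list of (embeddings, rewards) pairs, where embeddings
    is a (T, d) array of per-step state representations and rewards is
    a length-T array of per-step rewards.
    Returns the fitted clustering model, a row-normalized transition
    matrix between clusters, and the mean reward per cluster.
    """
    all_embed = np.concatenate([e for e, _ in trajectories], axis=0)
    km = KMeans(n_clusters=n_clusters, random_state=seed).fit(all_embed)

    transitions = np.zeros((n_clusters, n_clusters))
    reward_sum = np.zeros(n_clusters)
    visits = np.zeros(n_clusters)

    for embeddings, rewards in trajectories:
        labels = km.predict(embeddings)
        for t in range(len(labels) - 1):
            # Count only cluster-to-cluster moves: consecutive steps that
            # stay inside one cluster form a temporally extended "skill".
            if labels[t] != labels[t + 1]:
                transitions[labels[t], labels[t + 1]] += 1
        for label, r in zip(labels, rewards):
            reward_sum[label] += r
            visits[label] += 1

    # Normalize counts into transition probabilities between clusters.
    row = transitions.sum(axis=1, keepdims=True)
    P = np.divide(transitions, row,
                  out=np.zeros_like(transitions), where=row > 0)
    mean_reward = np.divide(reward_sum, visits,
                            out=np.zeros_like(reward_sum), where=visits > 0)
    return km, P, mean_reward
```

The per-cluster statistics returned here are what make the model useful for monitoring: a trajectory whose cluster-level transitions or rewards deviate sharply from P and mean_reward flags unexpected policy behavior.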
TL;DR: A method for understanding and improving deep agents by creating spatio-temporal abstractions
Conflicts: technion.ac.il
Keywords: Reinforcement Learning, Deep Learning