FactoredRL: Leveraging Factored Graphs for Deep Reinforcement Learning

28 Sept 2020 (modified: 05 May 2023) ICLR 2021 Conference Blind Submission
Keywords: reinforcement learning, factored mdp, factored rl
Abstract: We propose a simple class of deep reinforcement learning (RL) methods, called FactoredRL, that can leverage factored environment structures to improve the sample efficiency of existing model-based and model-free RL algorithms. In tabular and linear approximation settings, the factored Markov decision process literature has shown exponential improvements in sample efficiency by leveraging factored environment structures. We extend this to deep RL algorithms that use neural networks. For model-based algorithms, we use the factored structure to inform the state transition network architecture; for model-free algorithms, we use it to inform the Q-network or policy network architecture. We demonstrate that doing so significantly improves sample efficiency in both discrete and continuous state-action space settings.
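To make the architectural idea concrete, the following is a minimal sketch (not the paper's implementation) of a Q-network whose layout mirrors a given factored structure: each factor's sub-network reads only the state components in its scope, and the per-factor outputs are summed into a single Q-value estimate. The class name `FactoredQNetwork`, the `factor_scopes` index lists, and the additive aggregation are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class FactoredQNetwork(nn.Module):
    """Sketch: a Q-network structured around known state factors.

    Each factor has its own small MLP that sees only the state indices in
    its scope; the per-factor Q contributions are summed. Scopes and the
    summation are assumptions for illustration.
    """

    def __init__(self, state_dim, num_actions, factor_scopes, hidden=64):
        super().__init__()
        self.factor_scopes = factor_scopes  # list of index lists into the state vector
        self.subnets = nn.ModuleList(
            nn.Sequential(
                nn.Linear(len(scope), hidden),
                nn.ReLU(),
                nn.Linear(hidden, num_actions),
            )
            for scope in factor_scopes
        )

    def forward(self, state):
        # state: (batch, state_dim); each sub-network reads only its factor's scope
        q_per_factor = [
            net(state[:, scope])
            for net, scope in zip(self.subnets, self.factor_scopes)
        ]
        # Sum per-factor contributions into one Q-value per action: (batch, num_actions)
        return torch.stack(q_per_factor, dim=0).sum(dim=0)


# Usage: a 6-dimensional state split into two hypothetical factors with known scopes.
scopes = [[0, 1, 2], [3, 4, 5]]
qnet = FactoredQNetwork(state_dim=6, num_actions=4, factor_scopes=scopes)
q_values = qnet(torch.randn(8, 6))  # shape (8, 4)
```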
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: The paper proposes a method to decompose the relationships between states, actions and rewards using known structural information to improve sample efficiency of deep reinforcement learning algorithms.
Reviewed Version (pdf): https://openreview.net/references/pdf?id=cV5fZVZpIo