Model-free Causal Reinforcement Learning with Causal Diagrams

Published: 16 Jun 2023, Last Modified: 21 Jun 2023, IJCAI 2023 Workshop KBCG (Oral)
Keywords: Causal reinforcement learning, High-level action model, action space generalization, Value decomposition network, Markov Decision Process with Unobserved Confounders
TL;DR: We demonstrate how to utilize causal diagrams, which may contain bidirectional arcs, to design model-free deep Q-learning-based agents by decomposing the value function per causal variable.
Abstract: We present a new model-free causal reinforcement learning approach that exploits the structure of causal diagrams, which may be obtained through causal representation learning and causal discovery. Unlike the majority of work in causal reinforcement learning, which focuses on model-based approaches and off-policy evaluation, we explore another direction: online model-free methods. We achieve this by extending a causal sequential decision-making formulation that combines the factored Markov decision process (FMDP) and the MDP with unobserved confounders (MDPUC), and by incorporating the concept of action as intervention. Building on the MDPUC addresses the issue of bidirectional arcs in learned causal diagrams. The action-as-intervention idea allows high-level action models to be incorporated into the action space of an RL environment as a vector of interventions on the causal variables. We also present a value decomposition method that adopts the value decomposition network (VDN) architecture popular in multi-agent reinforcement learning, and we report encouraging preliminary evaluation results.
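To make the value-decomposition idea concrete, below is a minimal PyTorch sketch of how a per-causal-variable Q-decomposition with a VDN-style sum might look. It is an illustration under assumed interfaces, not the paper's implementation: the class names (PerVariableQ, CausalVDN), the discrete intervention indexing, the flat state input, and the hidden-layer sizes are all assumptions, and the paper presumably restricts each head's inputs according to the causal diagram rather than feeding every head the full state.

```python
import torch
import torch.nn as nn


class PerVariableQ(nn.Module):
    """Q-head for one causal variable: maps the state to a Q-value
    for each possible intervention on that variable (hypothetical
    interface; the actual per-variable inputs may differ)."""

    def __init__(self, state_dim: int, n_interventions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_interventions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)  # (batch, n_interventions)


class CausalVDN(nn.Module):
    """VDN-style decomposition: the joint Q-value is the sum of one
    Q-head per causal variable, each evaluated at the intervention
    chosen for that variable."""

    def __init__(self, state_dim: int, interventions_per_var: list[int]):
        super().__init__()
        self.heads = nn.ModuleList(
            PerVariableQ(state_dim, n) for n in interventions_per_var
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # action: LongTensor of shape (batch, n_vars), one intervention
        # index per causal variable (the "vector of interventions").
        q_per_var = [
            head(state).gather(1, action[:, i : i + 1])
            for i, head in enumerate(self.heads)
        ]
        return torch.cat(q_per_var, dim=1).sum(dim=1, keepdim=True)  # (batch, 1)

    def greedy_action(self, state: torch.Tensor) -> torch.Tensor:
        # Because the total Q is additive, each head can take its
        # argmax independently: greedy selection is linear in the
        # number of causal variables rather than exponential.
        return torch.stack(
            [head(state).argmax(dim=1) for head in self.heads], dim=1
        )
```

A side effect of the additive decomposition is that the greedy joint intervention factorizes across heads, which is what keeps the intervention-vector action space tractable as the number of causal variables grows.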