Resolving Causal Confusion in Reinforcement Learning via Robust Exploration


09 Mar 2021, 17:17 (modified: 15 Jun 2022, 19:17) · SSL-RL 2021 Poster
Abstract: A reinforcement learning agent must distinguish between spurious correlations and causal relationships in its environment in order to robustly achieve its goals. Contrary to popular belief, such cases of causal confusion *can* occur in online reinforcement learning (RL) settings. We demonstrate this, and show how causal confusion can lead to catastrophic failure under even mild forms of distribution shift. We formalize the problem of identifying causal structure in a Markov Decision Process, and highlight the central role played by the data collection policy in identifying and avoiding spurious correlations. We find that under insufficient exploration, many RL algorithms, including those with PAC-MDP guarantees, fall prey to causal confusion. To address this, we present a robust exploration strategy that enables causal hypothesis-testing through interaction with the environment. Our method outperforms existing state-of-the-art approaches at avoiding causal confusion, improving robustness and generalization across a range of tasks.