Reproducing and Extending Counterfactual Data Augmentation: A Study on Causal Identifiability and Stability in Reinforcement Learning
Keywords: Causal Reinforcement Learning, Counterfactual Data Augmentation, Offline Reinforcement Learning, Structural Causal Models, Robustness
TL;DR: This work presents a ground-up reproduction of the CTRL framework, evaluating counterfactual data augmentation via a multi-factor validation matrix in CartPole-SD and extending external-validity testing to LunarLander, MuJoCo, and D4RL environments.
Abstract: We present a reproducibility-focused reimplementation and extension of CTRL, a causal reinforcement learning method based on counterfactual data augmentation. Beyond reproducing the CartPole-SD experiments, we run a controlled validation matrix over counterfactual fraction, noise level, dataset size, and generator quality, then test transfer to LunarLander, MuJoCo, and D4RL-style offline settings. The empirical pattern is consistent across runs: counterfactual augmentation is conditionally useful, not uniformly superior. In CartPole, it can improve clean returns, especially with larger datasets and stronger generators, but gains under noisy evaluation are modest. In cross-domain settings, outcomes are mixed and the evidence is currently budget-limited. A comparison against a non-causal Base-S world model exposes a 'coverage-versus-bias' tradeoff: excessive augmentation can amplify transition inaccuracies, a failure mode that is particularly evident in balance-heavy tasks. We further find that Bellman-score selection is insufficient to overcome these biases in high-variance regimes. Our claim is therefore scoped: counterfactual augmentation can help offline RL in specific regimes, but its reliability depends on data regime, generator fidelity, and evaluation protocol. Finally, we release a verified, ground-up open-source implementation of the CTRL architecture to facilitate further research in causal RL.
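The validation matrix described in the abstract can be read as a full factorial grid over the four factors, with the counterfactual fraction controlling how many generated transitions enter each training set. The sketch below is only an illustration of that reading: the factor levels, the function names `validation_matrix` and `mix_dataset`, and the choice to hold the dataset size fixed while swapping real samples for counterfactual ones are all assumptions, not the settings or code used in the paper.

```python
import itertools
import random

# Hypothetical factor levels; placeholders, not the paper's actual settings.
FACTORS = {
    "cf_fraction": [0.0, 0.25, 0.5],
    "noise_level": [0.0, 0.1],
    "dataset_size": [1000, 5000],
    "generator_quality": ["weak", "strong"],
}


def validation_matrix(factors):
    """Return one config dict per cell of the full factorial grid."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in itertools.product(*factors.values())]


def mix_dataset(real, counterfactual, cf_fraction, seed=0):
    """Build a dataset of len(real) items in which roughly cf_fraction of the
    items are counterfactual samples (assumed mixing rule: size stays fixed)."""
    if not 0.0 <= cf_fraction <= 1.0:
        raise ValueError("cf_fraction must lie in [0, 1]")
    n_cf = round(cf_fraction * len(real))
    if n_cf > len(counterfactual):
        raise ValueError("not enough counterfactual samples")
    rng = random.Random(seed)
    # Sample without replacement from each pool, then shuffle the union.
    mixed = rng.sample(real, len(real) - n_cf) + rng.sample(counterfactual, n_cf)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    grid = validation_matrix(FACTORS)
    print(len(grid), "configurations; first:", grid[0])
```

Each returned config would drive one training run; sweeping `cf_fraction` while holding the other factors fixed is one way to probe the coverage-versus-bias tradeoff the abstract mentions.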
Paper Type: Full (minimum of 10 pages and a maximum of 16 excluding references)
Poster Opt In: Yes, I'm open to having my submission accepted as a poster
Supplementary Material: zip
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 7