A Reinforcement Learning Approach to Estimating Long-term Treatment Effects

Ziyang Tang; Yiheng Duan; Stephanie Zhang; Lihong Li

A Reinforcement Learning Approach to Estimating Long-term Treatment Effects

Ziyang Tang, Yiheng Duan, Stephanie Zhang, Lihong Li

Published: 01 Feb 2023, Last Modified: 12 Oct 2025Submitted to ICLR 2023Readers: Everyone

Keywords: reinforcement learning, off-policy evaluation, A/B testing

TL;DR: We propose a reinforcement learning approach to estimating the long-term treatment effect from short-term data.

Abstract: Randomized experiments (a.k.a. A/B tests) are a powerful tool for estimating treatment effects, to inform decisions making in business, healthcare and other applications. In many problems, the treatment has a lasting effect that evolves over time. A limitation with randomized experiments is that they do not easily extend to measure long-term effects, since running long experiments is time-consuming and expensive. In this paper, we take a reinforcement learning (RL) approach that estimates the average reward in a Markov process. Motivated by real-world scenarios where the observed state transition is nonstationary, we develop a new algorithm for a class of nonstationary problems, and demonstrate promising results in two synthetic datasets and one online store dataset.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/a-reinforcement-learning-approach-to/code)

14 Replies

Loading