A REINFORCEMENT LEARNING FRAMEWORK FOR TIME DEPENDENT CAUSAL EFFECTS EVALUATION IN A/B TESTING

Chengchun Shi; Xiaoyu Wang; Shikai Luo; Rui Song; Hongtu Zhu; Jieping Ye

28 Sept 2020 (modified: 22 Jun 2025), ICLR 2021 Conference Blind Submission, Readers: Everyone

Keywords: reinforcement learning, A/B testing, causal inference, sequential testing

Abstract: A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technological, and traditional industries. The aim of this paper is to introduce a reinforcement learning framework for carrying out A/B testing in two-sided marketplace platforms while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating. It is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties (e.g., size and power) of our testing procedure. Finally, we apply our framework to both synthetic data and a real-world data example obtained from a technology company to illustrate its advantage over current practice.

One-sentence Summary: We introduce a reinforcement learning framework to evaluate time-dependent causal effects in A/B testing.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/a-reinforcement-learning-framework-for-time/code)

Reviewed Version (pdf): https://openreview.net/references/pdf?id=5NxrdV2_Zp
