offline_rl_ope: A Python package for off-policy evaluation of offline RL models with real world data

Joshua William Spear; Matthieu Komorowski; Rebecca Pope; Neil J Sebire

25 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: Offline RL, OPE, Python, PyTorch

TL;DR: A Python package for off-policy evaluation of offline RL models, focused on real-world applications.

Abstract: offline_rl_ope is a fully unit-tested, runtime type-checked Python package for off-policy evaluation (OPE) of offline RL models. It is designed for OPE workflows on real-world data: it handles uneven trajectory lengths natively; includes novel convergence metrics that do not rely on OPE estimator ground truths; and provides a compute- and data-efficient API that can be integrated with many offline RL frameworks. This paper motivates and describes the core API design and functionality to enable ease of use and extension. The implementations of OPE methods have been benchmarked against existing implementations to ensure consistency and reproducibility. The offline_rl_ope source code can be found on GitHub at: REDACTED.
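
To make the uneven-trajectory setting in the abstract concrete, the following is a minimal sketch of ordinary (trajectory-wise) importance sampling over trajectories of different lengths. It is not the offline_rl_ope API: the function name `trajectory_is_estimate`, the `Step` tuple layout, and the default `gamma` are all assumptions introduced for illustration, and the sketch uses plain Python rather than PyTorch.

```python
from math import prod
from typing import Sequence, Tuple

# Hypothetical step layout (an assumption, not the package's data format):
# (evaluation-policy probability of the logged action,
#  behaviour-policy probability of the logged action,
#  observed reward)
Step = Tuple[float, float, float]


def trajectory_is_estimate(
    trajectories: Sequence[Sequence[Step]], gamma: float = 0.99
) -> float:
    """Ordinary importance sampling estimate of the evaluation policy's value.

    Each trajectory may have a different length: the importance weight and the
    discounted return are computed per trajectory, so no padding is required.
    Behaviour-policy probabilities are assumed to be strictly positive.
    """
    if not trajectories:
        raise ValueError("at least one trajectory is required")
    total = 0.0
    for traj in trajectories:
        # Product of per-step likelihood ratios over the whole trajectory.
        weight = prod(pi_e / pi_b for pi_e, pi_b, _ in traj)
        # Discounted sum of the logged rewards.
        discounted_return = sum(gamma ** t * r for t, (_, _, r) in enumerate(traj))
        total += weight * discounted_return
    # Average the weighted returns over the number of trajectories.
    return total / len(trajectories)
```

Because the weight and return are accumulated trajectory by trajectory, a two-step and a one-step trajectory can be passed together in the same call. A tensor-based implementation would typically need masking or padding to achieve the same effect.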

Primary Area: infrastructure, software libraries, hardware, systems, etc.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 4583
