Dynamics Models for Offline Hyperparameter Selection in Water Treatment

Published: 13 Jun 2025, Last Modified: 27 Jun 2025, RL4RS 2025, CC BY 4.0
Keywords: Applied RL, Dynamics Models, Hyperparameter Selection, Water Treatment, Industrial Control
Abstract: A key barrier to deploying reinforcement learning algorithms in real-world systems is the challenge of hyperparameter selection, particularly when simulators are unavailable or online interactions are costly. Recent work has proposed using calibration models trained on offline data to simulate environment interactions and support offline hyperparameter selection, but prior applications have been restricted to simple, simulated domains. In this paper, we present the first application of calibration models to a real-world industrial setting: the Drayton Valley water treatment plant in Alberta, Canada. We evaluate several calibration models, including a k-nearest neighbors model with a Laplacian distance metric, on high-dimensional, non-stationary sensor data for nexting prediction tasks. We demonstrate that these models can generate realistic long-horizon rollouts and recover meaningful hyperparameter sensitivity curves. We also explore how calibration models scale to year-long datasets, how they help select fine-tuning learning rates for pre-trained agents, and how they perform under distribution shifts. Our results provide a proof of concept for using offline dynamics models to guide RL deployment in real-world environments, highlighting both the promise of this approach and the challenges that remain.
Submission Number: 15
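
As a rough illustration of the calibration-model idea summarized in the abstract, the sketch below builds a nearest-neighbor dynamics model from logged transitions and rolls it forward under a policy. It is not the authors' implementation: the class `KNNCalibrationModel`, the function `rollout`, the `(state, action, reward, next_state)` tuple layout, and the restriction to same-action neighbors are all assumptions made for illustration, and plain Euclidean distance stands in for the Laplacian distance metric named in the abstract.

```python
import math
import random


class KNNCalibrationModel:
    """Hypothetical nearest-neighbor dynamics model over logged transitions."""

    def __init__(self, transitions, k=3, seed=0):
        # transitions: iterable of (state, action, reward, next_state);
        # states are tuples of floats, actions are any comparable value.
        self.transitions = list(transitions)
        self.k = k
        self.rng = random.Random(seed)

    def step(self, state, action):
        # Only consider logged transitions that used the same action.
        candidates = [t for t in self.transitions if t[1] == action]
        if not candidates:
            raise ValueError(f"no logged transition for action {action!r}")
        # ASSUMPTION: Euclidean distance, standing in for the Laplacian metric.
        candidates.sort(key=lambda t: math.dist(t[0], state))
        # Sample one of the k nearest neighbors and replay its outcome.
        _, _, reward, next_state = self.rng.choice(candidates[: self.k])
        return reward, next_state


def rollout(model, policy, start_state, horizon):
    """Roll the model forward; return the visited states and the total reward."""
    state, total, states = start_state, 0.0, [start_state]
    for _ in range(horizon):
        reward, state = model.step(state, policy(state))
        total += reward
        states.append(state)
    return states, total
```

Under these assumptions, offline hyperparameter selection would amount to training an agent inside `model.step` once per candidate setting (for example, each learning rate) and comparing the resulting returns, with no interaction with the real plant.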