- Keywords: reinforcement learning, continuous control, recurrent neural networks, policy gradient, bioprocess, optimisation
- TL;DR: Constrained continuous control of bioprocesses via policy gradients and recurrent neural networks.
- Abstract: Bioprocesses have recently received attention as a route to clean and sustainable alternatives to fossil-based materials. However, they are generally difficult to optimise because of their unsteady-state operation modes and stochastic behaviours. Furthermore, biological systems are highly complex, so plant-model mismatch is often present. In this work, we leverage a model-free Reinforcement Learning optimisation strategy. We apply the Policy Gradient method to tune a control policy parametrised by a recurrent neural network. We assume that a preliminary model of the process is available, which is exploited to obtain an initial optimal control policy. Subsequently, this policy is updated using a variation of the starting model with adequate disturbance, which simulates the plant-model mismatch.