Cascade - A sequential ensemble method for continuous control tasks

Published: 09 May 2025, Last Modified: 09 May 2025RLC 2025EveryoneRevisionsBibTeXCC BY 4.0
Keywords: Ensemble learning, reinforcement learning, continuous control, PPO
TL;DR: A novel ensemble reinforcement learning algorithm named Cascade that uses a convex combination of its base policies.
Abstract: Though reinforcement learning has been successfully applied to a variety of domains, there is still room left for improvement, in particular, in terms of the final performance. Ensemble Reinforcement Learning (ERL) tries to enhance reinforcement learning techniques by using multiple models or algorithms. We propose a novel ERL technique, called Cascade which in the context of continuous control tasks and with PPO as the base training algorithm clearly outperforms standard PPO in terms of the final performance. To shine light on the working mechanisms of Cascade, we conduct ablation studies, showing how the different components of Cascade contribute to its overall performance. Furthermore, we demonstrate that Cascade has a robust monotonicity as the ensemble’s performance increases with each additional base agent even when weak base agents are added in large numbers.
Submission Number: 32
Loading