RAISE the Bar: Ensemble-based Online Reinforcement Learning for Dynamic Workflow Scheduling

Yifan Yang; Gang Chen; Hui Ma; Zhiguang Cao; Mengjie Zhang; Yew-Soon Ong

11 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0

Keywords: Workflow Scheduling; Reinforcement Learning; Online Learning

TL;DR: We propose RAISE, an online RL method with ensemble actors and dual critics that delivers stable and adaptive scheduling under non-stationary workloads.

Abstract: Dynamic workflow scheduling (DWS) in cloud computing poses major challenges due to unpredictable workflow arrivals, heterogeneous resources, and evolving system states. While reinforcement learning (RL) has shown promise for learning adaptive scheduling policies, existing single-policy approaches often struggle in online settings with non-stationary workloads. We propose **RAISE** (*Robust Actor-Critic Integration for Scheduling Ensembles*), an ensemble-based online RL approach designed to improve adaptability and stability in dynamic environments. RAISE maintains a set of pre-trained actors and critics that are continually updated during deployment to support robust scheduling. It integrates three key components: (1) *Value-Ranked Action Aggregation*, which combines majority voting with critic-guided tie-breaking for stable action selection; (2) *Dual Critic Ensembles with Decoupled Updates*, which balance fast adaptation and stable value estimates; and (3) *Decision-Aligned Policy Updates*, which enhance sample efficiency by updating only the actors responsible for chosen actions. Experiments on large-scale DWS benchmarks show that RAISE consistently outperforms state-of-the-art baselines in both performance and robustness, demonstrating the effectiveness of ensemble-based online RL for real-time scheduling under non-stationary conditions.
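As an illustration of the first and third components, the following is a minimal sketch and not the authors' implementation. It assumes that each actor proposes one discrete action, that each critic returns a scalar estimate per candidate action, that higher estimates are better, and that ties are broken by the mean estimate across critics. The names `aggregate_actions`, `votes`, and `critic_values` are hypothetical, and the dual critic ensembles are not modeled here.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def aggregate_actions(
    votes: Sequence[int],
    critic_values: Dict[int, Sequence[float]],
) -> Tuple[int, List[int]]:
    """Pick an action by majority vote, breaking ties with critic estimates.

    votes[i] is the action proposed by actor i (hypothetical encoding).
    critic_values[a] holds one estimate per critic for action a
    (assumption: higher is better).
    Returns the chosen action and the indices of the actors that voted for it.
    """
    counts = Counter(votes)
    top_count = max(counts.values())
    # Actions that share the highest vote count.
    tied = [action for action, n in counts.items() if n == top_count]
    # Tie-break by mean critic estimate (assumed aggregation rule).
    chosen = max(tied, key=lambda action: mean(critic_values[action]))
    # Only these actors would receive a policy update for this decision.
    supporters = [i for i, action in enumerate(votes) if action == chosen]
    return chosen, supporters


if __name__ == "__main__":
    votes = [2, 0, 2, 0, 1]
    critic_values = {0: [1.0, 3.0], 1: [9.0, 9.0], 2: [2.5, 2.5]}
    print(aggregate_actions(votes, critic_values))
```

In this toy call, actions 0 and 2 tie on votes and the critics prefer action 2, so actors 0 and 2 would be the ones updated. Action 1 is not selected despite its higher estimate, because under this sketch the critics only rank actions that are already tied on votes.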

Primary Area: optimization

Submission Number: 3965
