Abstract: This paper evaluates four Reinforcement Learning (RL) algorithms, namely Proximal Policy Optimization (PPO), Policy Gradient (PG), Advantage Actor-Critic (A2C), and Asynchronous Advantage Actor-Critic (A3C), for solving the Job Shop Scheduling Problem (JSSP) on the Lawrence, Demirkol, and Taillard benchmark datasets. Experiments show that PPO consistently outperforms traditional dispatching rules and state-of-the-art methods, achieving optimality gaps 6–9 times lower than traditional dispatching rules and 2–3 times lower than state-of-the-art approaches across all datasets. These results demonstrate the potential of RL, particularly PPO, for enhancing scheduling optimization on the JSSP.
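The comparisons above are stated in terms of optimality gaps; the sketch below shows the standard way this metric is computed on JSSP benchmarks, assuming the usual definition relative to each instance's best-known makespan. The numeric values in the example are hypothetical and purely illustrative, not results from the paper.

    def optimality_gap(makespan: float, best_known: float) -> float:
        """Relative gap of a schedule's makespan to the instance's best-known makespan."""
        return (makespan - best_known) / best_known

    # Hypothetical illustration (not values from the paper): a schedule with
    # makespan 1315 on an instance whose best-known makespan is 1222
    # has a gap of roughly 7.6%.
    if __name__ == "__main__":
        gap = optimality_gap(1315, 1222)
        print(f"optimality gap: {gap:.3%}")

Per-instance gaps of this form are typically averaged over each benchmark set (Lawrence, Demirkol, Taillard) before methods are compared.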
External IDs: dblp:conf/sgai/MaharjanAJ24