ORL: Reinforcement Learning Benchmarks for Online Stochastic Optimization Problems

Bharathan Balaji; Jordan Bell-Masterson; Enes Bilgin; Andreas Damianou; Pablo Moreno Garcia; Arpit Jain; Anna Luo; Alvaro Maggiar; Balakrishnan Narayanaswamy; Chun Ye

06 May 2019 (modified: 13 Apr 2025) | RL4RealLife 2019 | Readers: Everyone

Keywords: reinforcement learning, benchmarks, operations research, online stochastic optimization, bin packing, vehicle routing, newsvendor

Abstract: Reinforcement Learning (RL) has recently achieved state-of-the-art performance in a wide variety of domains, from robotics to gaming to traffic control. The domain of Operations Research (OR) is particularly amenable to RL approaches because many of its canonical problems can be characterized as online stochastic optimization problems where the distribution of data is unknown. While there is a nascent literature at the intersection of RL and OR, there are no commonly accepted benchmarks that can be used to compare proposed approaches rigorously in terms of performance, scale, or generalizability. This paper aims to fill that gap by introducing open-source OR+RL benchmarks for three canonical OR problems with a wide range of practical applications: Bin Packing, Newsvendor, and Vehicle Routing. We apply both well-known OR approaches and newer RL algorithms to these problems and analyze the results. For each of these problems, we find that RL is competitive with or superior to the OR baselines, pointing the way for future theoretical work and highlighting RL's immediate potential utility in a host of real-world problems.

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/orl-reinforcement-learning-benchmarks-for/code)
