RLEM: Deep Reinforcement Learning Ensemble Method for Aircraft Recovery Problem

Published: 01 Jan 2024 · Last Modified: 21 Jul 2025 · IEEE Big Data 2024 · CC BY-SA 4.0
Abstract: Efficient flight scheduling is crucial for allocating airline resources properly, but even the best flight schedule must cope with unexpected delays and disruptions. The ability to recover from such disruptions is essential for airlines to minimize the negative impact on their revenue and reputation. In this context, machine learning-based methods can be used to identify suitable recovery actions as unexpected events occur. Reinforcement learning approaches are especially promising because they find suitable solutions far more efficiently than conventional optimization and meta-heuristic methods, providing airlines with timely rescheduling capabilities and thus reducing capital and reputation losses. However, current works either do not leverage deep learning or focus on simple scenarios that do not fully capture real-world complexities, resulting in limited efficiency or sub-optimal solutions. In this paper, we propose an ensemble of two deep reinforcement learning approaches: Deep Double Q-Learning (DDQL) and Advantage Actor-Critic (A2C). The models aim to minimize the total delay caused by disruptions, using aircraft swaps and flight delays as recovery options. We perform experiments on a benchmark dataset and a real-world airline dataset, showing that our method significantly reduces the delays caused by disruptions.
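The abstract does not specify how the DDQL and A2C agents are combined, so the following is only a minimal sketch of one plausible ensemble rule over a hypothetical discrete recovery action space (delay a flight or swap two aircraft). The action list, the state features, and the softmax-averaging rule are illustrative assumptions, and random numbers stand in for the trained networks' outputs; this is not the paper's implementation.

```python
# Minimal sketch (assumptions only, not the paper's method) of combining a
# DDQL value head and an A2C policy head to pick one recovery action.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete recovery actions: delay one disrupted flight by a
# fixed slot, or swap the aircraft assigned to two rotations.
ACTIONS = [
    ("delay_flight", 101), ("delay_flight", 102), ("delay_flight", 103),
    ("swap_aircraft", ("AC1", "AC2")), ("swap_aircraft", ("AC1", "AC3")),
    ("swap_aircraft", ("AC2", "AC3")),
]
N_ACTIONS = len(ACTIONS)

def ddql_q_values(state):
    """Placeholder for the trained DDQL network's Q-values (one per action)."""
    return rng.normal(size=N_ACTIONS)

def a2c_policy(state):
    """Placeholder for the trained A2C actor's action probabilities."""
    logits = rng.normal(size=N_ACTIONS)
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def ensemble_action(state):
    """One plausible combination rule: softmax-normalise the Q-values,
    average them with the A2C probabilities, and take the argmax."""
    q = ddql_q_values(state)
    q_soft = np.exp(q - q.max())
    q_soft /= q_soft.sum()
    combined = 0.5 * q_soft + 0.5 * a2c_policy(state)
    return ACTIONS[int(np.argmax(combined))]

state = rng.normal(size=16)  # placeholder features describing the disruption
print("chosen recovery action:", ensemble_action(state))
```

In practice the two heads would be trained on the recovery environment and the chosen action applied to the schedule, with the resulting delay reduction used as the reward signal.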