A Bi-Layer Joint Training Reinforcement Learning Framework for Post-Disaster Rescue

Songjun Huang, Chuanneng Sun, Jie Gong, Dario Pompili

Published: 01 Jan 2024, Last Modified: 05 Nov 2025, DCOSS-IoT 2024, CC BY-SA 4.0

Abstract: In modern post-disaster rescue missions, Multi-Robot Systems (MRSs) play a key role in minimizing injuries and deaths among rescue personnel and civilians. Operating an MRS efficiently requires both a well-suited task allocation mechanism and an efficient pathfinding algorithm. However, because communication is inconsistent and bandwidth is low, traditional task allocation and pathfinding frameworks may be impractical or perform poorly. To address these problems, a novel Bi-Layer Joint Training Reinforcement Learning (BJoT-RL) framework is proposed. In the first layer, a Multi-Head Deep Q-Learning (MHDQN) method performs task allocation; in the second layer, a Condition-Constrained Q-Learning (CCQ) method performs pathfinding. Notably, the output of each layer is used in the training of the other layer, which couples the two layers tightly, hence the term joint training. Simulations show that the BJoT-RL framework outperforms state-of-the-art solutions in such applications.

External IDs: dblp:conf/dcoss/HuangSGP24