Adaptive multi-model fusion learning for sparse-reward reinforcement learning

Published: 01 Jan 2025 · Last Modified: 13 May 2025 · Neurocomputing 2025 · CC BY-SA 4.0
Abstract: In this paper, we address intrinsic reward generation for sparse-reward reinforcement learning, where the agent receives limited extrinsic feedback from the environment. Traditional approaches to intrinsic reward generation often rely on prediction errors from a single model, where the intrinsic reward is derived from the discrepancy between the model’s predicted outputs and the actual targets. This approach exploits the observation that less-visited state–action pairs typically yield higher prediction errors. We extend this framework by incorporating multiple prediction models and propose an adaptive fusion technique specifically designed for the multi-model setting. We establish and mathematically justify key axiomatic conditions that any viable fusion method must satisfy. Our adaptive fusion approach dynamically learns the best way to combine prediction errors during training, leading to improved learning performance. Numerical experiments validate the effectiveness of our method, showing significant performance gains across various tasks compared to existing approaches.
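The abstract does not include an implementation, but the mechanism it describes can be sketched. The sketch below is illustrative rather than the authors' method: it builds several randomly initialized predictor/target pairs in the spirit of RND-style novelty bonuses, uses each predictor's error as a per-model novelty signal, and fuses the errors adaptively by rescaling each one with a running estimate of its magnitude. The class name, the `err_scale` statistic, and this specific fusion rule are all assumptions; the paper's adaptive fusion is learned during training and may differ.

```python
# Illustrative sketch only (not the paper's implementation): intrinsic rewards
# from the fused prediction errors of several randomly initialized
# predictor/target pairs.
import torch
import torch.nn as nn


def make_net(obs_dim: int, out_dim: int) -> nn.Module:
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))


class MultiModelIntrinsicReward:
    def __init__(self, obs_dim: int, out_dim: int = 32, n_models: int = 3, lr: float = 1e-3):
        # One fixed random target and one trainable predictor per model.
        self.targets = [make_net(obs_dim, out_dim).requires_grad_(False) for _ in range(n_models)]
        self.predictors = [make_net(obs_dim, out_dim) for _ in range(n_models)]
        # Running scale of each model's error, used for adaptive fusion (an assumption).
        self.err_scale = torch.ones(n_models)
        params = [p for m in self.predictors for p in m.parameters()]
        self.opt = torch.optim.Adam(params, lr=lr)

    def _errors(self, obs: torch.Tensor) -> torch.Tensor:
        # Per-model squared prediction error; high on rarely visited states.
        return torch.stack(
            [((p(obs) - t(obs)) ** 2).mean(dim=-1)
             for p, t in zip(self.predictors, self.targets)]
        )  # shape: (n_models, batch)

    def intrinsic_reward(self, obs: torch.Tensor) -> torch.Tensor:
        # Fusion step: rescale each model's error by its running magnitude so
        # no single model dominates, then average across models.
        with torch.no_grad():
            errors = self._errors(obs)
            return (errors / self.err_scale.clamp(min=1e-8).unsqueeze(1)).mean(dim=0)

    def update(self, obs: torch.Tensor) -> None:
        # Train predictors toward their targets so familiar states stop paying
        # out, and refresh the per-model error scales used for fusion.
        errors = self._errors(obs)
        self.err_scale = 0.99 * self.err_scale + 0.01 * errors.detach().mean(dim=1)
        loss = errors.mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
```

Rescaling each error by its running magnitude is one simple way to meet the kind of scale-robustness a fusion rule plausibly needs; the paper instead learns the combination adaptively during training.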