Parallel TD3 for Policy Gradient-Based Multi-condition Multi-objective Optimisation

Dasun Shalila Balasooriya, Alan Blair, Ben Wilks, Craig A. Wheeler, Tahir Jauhar, Stephan K. Chalup

Published: 2025, Last Modified: 11 Jun 2025, Venue: EMO (2) 2025, License: CC BY-SA 4.0

Abstract: We propose a novel optimisation algorithm for multi-condition multi-objective problems that uses a single-step Twin Delayed Deep Deterministic Policy Gradient (TD3) framework with parallel function evaluations. We tested the algorithm on two mathematical test problems and an airfoil shape optimisation scenario. In all cases, our approach outperformed both a previously proposed deep reinforcement learning algorithm and the well-known NSGA-II genetic algorithm, discovering a more comprehensive Pareto front with fewer function evaluations. The proposed algorithm also trained its deep neural networks more stably across all problems.
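
To make the notion of a Pareto front used above concrete, the following minimal Python sketch filters a set of candidate objective vectors down to its non-dominated subset. It is an illustration only, not the authors' implementation: it assumes every objective is minimised, and the names `dominates`, `pareto_front`, and `candidates` are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (assumption: all objectives are minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Return the non-dominated subset of points, preserving input order."""
    return [
        p
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical two-objective evaluations (illustrative values, not from the paper).
candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (2.5, 2.5)]
front = pareto_front(candidates)
print(front)
```

Here `(3.0, 3.0)` and `(2.5, 2.5)` are dropped because `(2.0, 2.0)` is at least as good in both objectives and strictly better in at least one. This pairwise check is quadratic in the number of points, which is adequate for small illustrative sets.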