Augmenting the Robustness of Tactical Maneuver Decision-Making in Unmanned Aerial Combat Vehicles During Dogfights via Prioritized Population Play With Diversified Partners
Abstract: This article introduces prioritized population play with diversified partners (P3DP) to improve the robustness of tactical maneuver decision-making in unmanned aerial combat vehicles during dogfights against a spectrum of opponent strategies. Population play, in which multiple agents coexist during the training phase, is used to facilitate initialization. Prioritized opponent sampling, which favors agents with higher historical win rates, focuses training on more competitive and challenging scenarios, thereby improving the quality of the data. Furthermore, a diversity-guided approach is implemented, in which each agent maintains its individual policy entropy while the agents collectively optimize the population entropy (PE). This increases diversity within the population and prevents the policies from becoming identical. In addition, a partner mechanism is introduced, in which a primary agent disregards the PE so that the optimality of its policy is not compromised by diversity considerations. Meanwhile, the other agents act as partners, supplying varied opponent policies to increase data diversity. Experiments validate that P3DP surpasses other training methodologies, substantially improving policy robustness.
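The abstract does not state how prioritized opponent sampling is weighted, so the following is only a minimal sketch of one plausible reading: opponents are drawn with probability given by a softmax over their historical win rates. The softmax form, the `temperature` parameter, and the function names `opponent_probabilities` and `sample_opponent` are assumptions made for illustration and are not taken from the article.

```python
import math
import random


def opponent_probabilities(win_rates, temperature=0.5):
    """Return sampling probabilities that favor higher win rates.

    Assumption: a softmax over win rates. The article only says that
    agents with higher historical win rates are favored.
    """
    if not win_rates:
        raise ValueError("win_rates must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    highest = max(win_rates)
    exps = [math.exp((w - highest) / temperature) for w in win_rates]
    total = sum(exps)
    return [e / total for e in exps]


def sample_opponent(win_rates, rng=random, temperature=0.5):
    """Draw the index of one opponent from the population."""
    probs = opponent_probabilities(win_rates, temperature)
    return rng.choices(range(len(win_rates)), weights=probs, k=1)[0]


# Hypothetical population of three agents with made-up win rates.
example_win_rates = [0.2, 0.5, 0.8]
example_probs = opponent_probabilities(example_win_rates)
example_index = sample_opponent(example_win_rates, rng=random.Random(0))
```

In this sketch, a lower `temperature` concentrates sampling on the strongest opponents, while a higher one approaches uniform sampling over the population.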
External IDs:dblp:journals/taes/HanCLD25