Learning Continuous 3-DoF Air-to-Air Close-in Combat Strategy using Proximal Policy Optimization

Published: 01 Jan 2022, Last Modified: 10 May 2023. CoG 2022
Abstract: Air-to-air close-in combat builds on many basic fighter maneuvers and can largely be modeled as an algorithmic function of its inputs. This paper studies autonomous close-in combat, aiming to learn a strategy that adapts to different circumstances when fighting an opponent. Current methods for learning close-in combat strategies are largely limited to discrete action sets, whether in the form of rules, actions, or sub-policies. In contrast, we consider a one-on-one air combat game with a continuous action space and present a deep reinforcement learning method based on proximal policy optimization (PPO) that learns a close-in combat strategy from observations in an end-to-end manner. The state space is designed to promote the learning efficiency of PPO. We also design a minimax strategy for the game. Simulation results show that the learned PPO agent is able to defeat the minimax opponent with about a 97% win rate.
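The paper's implementation is not reproduced here, but the clipped surrogate objective that standard PPO maximizes can be sketched as follows. This is a minimal, generic illustration of PPO's clipping mechanism (the function name, `eps` default, and plain-Python style are assumptions, not the authors' code):

```python
import math

def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Clipped surrogate objective from PPO (Schulman et al., 2017).

    For each sample, the probability ratio r = exp(new_logp - old_logp)
    is clipped to [1 - eps, 1 + eps], and the objective takes the
    pessimistic minimum of the clipped and unclipped terms, averaged
    over samples. Clipping discourages destructively large policy
    updates, which is one reason PPO suits continuous control tasks
    like the 3-DoF combat setting described in the abstract.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)                       # pi_new / pi_old
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))  # clip to [1-eps, 1+eps]
        total += min(ratio * adv, clipped * adv)         # pessimistic bound
    return total / len(advantages)
```

For a continuous action space such as this paper's, the log-probabilities would typically come from a Gaussian policy whose mean and standard deviation are output by the policy network.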