Single Reinforcement Learning Policy for Landing a Drone Under Different UGV Velocities and Trajectories

Published: 01 Jan 2023, Last Modified: 02 Mar 2025 · ICCMA 2023 · CC BY-SA 4.0
Abstract: We propose an algorithm that combines Reinforcement Learning (RL) with a PID cascade controller for landing a drone on a moving unmanned ground vehicle (UGV). Unlike other works, where each policy is trained for one specific task represented by a single ground vehicle velocity, here we present a unified policy that enables the drone to land on a UGV moving along multiple trajectories at different velocities. This improves the drone's capability and efficiency by reusing the parameters across different scenarios. More specifically, we consider the landing tasks obtained by combining UGV platform linear velocities of 0.1 m/s, 0.2 m/s, and 0.3 m/s with angular velocities of -π/4 rad/s, 0, and +π/4 rad/s. In addition, the velocity set points sent to the platform are included in the state space, enabling the policy to discriminate among situations and to land successfully under different conditions with a single policy. The trained policy provides position offset commands to a cascade controller, which, in turn, converts them into the drone's motor thrusts, enabling efficient maneuvering. Furthermore, training proceeds with parallel threads collecting experiences under different platform velocities, and actions are issued over a larger time window, compatible with real-world applications where latency exists between sensor processing and data transmission. Finally, the simulated experiments reveal convergence and demonstrate efficient landings under different conditions. The use of an underlying cascade controller exempts the policy from having to learn to stabilize the drone locally.
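As a rough illustration of the scenario setup described above, the sketch below enumerates the nine UGV velocity combinations (three linear × three angular set points) and shows how the set points could be appended to the drone's observation so that a single policy can discriminate among scenarios. This is a minimal sketch under assumptions; the function and variable names (`make_observation`, `SCENARIOS`, the relative-state layout) are illustrative, not the authors' implementation.

```python
import itertools
import math
import random

# Nine UGV velocity scenarios from the abstract:
# linear speeds in m/s crossed with angular rates in rad/s.
LINEAR = [0.1, 0.2, 0.3]
ANGULAR = [-math.pi / 4, 0.0, math.pi / 4]
SCENARIOS = list(itertools.product(LINEAR, ANGULAR))  # 9 combinations

def make_observation(rel_pos, rel_vel, setpoint):
    """Concatenate the drone-UGV relative state with the UGV velocity
    set points, so the same policy can tell the scenarios apart.
    (Hypothetical state layout for illustration only.)"""
    v_lin, v_ang = setpoint
    return list(rel_pos) + list(rel_vel) + [v_lin, v_ang]

# Each parallel training worker would sample a scenario per episode:
random.seed(0)
setpoint = random.choice(SCENARIOS)
obs = make_observation(rel_pos=(1.0, -0.5, 2.0),
                       rel_vel=(0.0, 0.1, -0.2),
                       setpoint=setpoint)
print(len(SCENARIOS), len(obs))  # → 9 8
```

The policy's output in this scheme would be a position offset handed to the cascade controller rather than raw motor thrusts, which is what relieves the RL agent of low-level stabilization.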
