Autonomous navigation of UAV in large-scale unknown complex environment with deep reinforcement learning

Published: 01 Jan 2017, Last Modified: 13 Nov 2024 · GlobalSIP 2017 · License: CC BY-SA 4.0
Abstract: Unmanned Aerial Vehicle (UAV)-based delivery is thriving. In this paper, we model the autonomous navigation of a UAV in a large-scale unknown complex environment as a discrete-time continuous-control problem and solve it with deep reinforcement learning. Without path planning or map construction, our method enables UAVs to navigate from arbitrary departure points to destinations using only sensory information about the local environment and a GPS signal. We argue that the navigation task is a partially observable Markov decision process (POMDP) and that the extant recurrent deterministic policy gradient (RDPG) algorithm is inefficient for it. Consequently, we derive a faster policy-learning algorithm for POMDPs based on the actor-critic architecture. To validate our ideas, we simulate five virtual environments and a virtual UAV flying at a fixed altitude with constant speed. The UAV perceives its local environment by measuring the distances to obstacles in multiple directions. Simulation results demonstrate the effectiveness of our method.
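The abstract names the general approach (a recurrent actor-critic deterministic policy gradient for POMDPs) but not its details. As an illustration only, the PyTorch sketch below shows an RDPG-style update on whole observation histories; the observation and action dimensions, return-based critic targets, network sizes, and hyperparameters are all assumptions for the sketch, not the paper's actual algorithm.

# Hedged sketch of a recurrent deterministic actor-critic update for a POMDP,
# in the spirit of RDPG. All dimensions and hyperparameters are illustrative.
import torch
import torch.nn as nn

OBS_DIM = 16   # assumed: distance readings in multiple directions + GPS offset
ACT_DIM = 1    # assumed: steering command at fixed altitude, constant speed

class RecurrentActor(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.gru = nn.GRU(OBS_DIM, hidden, batch_first=True)
        self.head = nn.Linear(hidden, ACT_DIM)

    def forward(self, obs_seq):                  # obs_seq: (B, T, OBS_DIM)
        h, _ = self.gru(obs_seq)
        return torch.tanh(self.head(h))          # actions in [-1, 1]: (B, T, ACT_DIM)

class RecurrentCritic(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.gru = nn.GRU(OBS_DIM + ACT_DIM, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, obs_seq, act_seq):
        h, _ = self.gru(torch.cat([obs_seq, act_seq], dim=-1))
        return self.head(h)                      # Q-estimates: (B, T, 1)

actor, critic = RecurrentActor(), RecurrentCritic()
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def update(obs_seq, act_seq, ret_seq):
    """One gradient step on a batch of episode histories.

    obs_seq: (B, T, OBS_DIM); act_seq: (B, T, ACT_DIM);
    ret_seq: (B, T, 1) discounted returns, used as simple critic targets here.
    """
    # Critic: regress Q(history_t, a_t) toward the observed returns.
    critic_loss = (critic(obs_seq, act_seq) - ret_seq).pow(2).mean()
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: ascend the critic along the deterministic policy gradient,
    # i.e. maximize Q(history_t, pi(history_t)).
    actor_loss = -critic(obs_seq, actor(obs_seq)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

# Dummy batch standing in for replayed UAV flight histories.
B, T = 8, 50
update(torch.randn(B, T, OBS_DIM),
       torch.rand(B, T, ACT_DIM) * 2 - 1,
       torch.randn(B, T, 1))

Conditioning both networks on the full observation history via a GRU is what makes this suitable for a POMDP: the recurrent state serves as a learned belief over the unobserved parts of the environment.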
