Intelligent Decision-Making Method for AUV Path Planning Against Ocean Current Disturbance via Reinforcement Learning

Jiabao Wen; Huiao Dai; Jingyi He; Lijiao Sun; Liqing Gao

Published: 01 Jan 2024, Last Modified: 13 May 2025. IEEE Internet Things J. 2024. License: CC BY-SA 4.0

Abstract: With the development of society and the economy, low-carbon and low-energy means of exploiting marine resources are receiving increasing attention. Autonomous path planning is a fundamental capability for IoT autonomous underwater vehicles (AUVs) carrying out ocean exploration tasks. Currently, the main issue is that the marine environment presents numerous disturbances and uncertainties in practical applications, which can significantly degrade path planning and lead to high energy consumption and carbon emissions. To address this challenge, this article presents a sustainable reinforcement learning algorithm that handles time-varying current disturbances to achieve low-carbon AUV path planning. The approach proceeds in three steps. First, a 3-D time-varying current environment is established as the environmental framework for reinforcement learning, and the dynamic model of the AUV is formulated. Second, to enhance training efficiency and reduce the AUV’s energy consumption, this article proposes the ocean current disturbance rejection PPO (OCDRP) algorithm, which incorporates tidal current information to enhance the AUV’s resilience to time-varying currents. Lastly, expectile regression methods are introduced to facilitate the algorithm’s convergence. Experimental results confirm the efficacy of the proposed algorithm and its adaptability to time-varying currents, making it an efficient, adaptable, and low-carbon sustainable path planning approach.
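
The abstract names expectile regression but does not say where it enters the OCDRP algorithm. As a generic illustration only, and not the authors' implementation, the standard expectile loss weights squared errors asymmetrically: errors where the target exceeds the prediction are weighted by tau, the others by 1 - tau. The function names `expectile_loss` and `sample_expectile`, and the default `tau=0.7`, are assumptions chosen for this sketch.

```python
import numpy as np


def expectile_loss(pred, target, tau=0.7):
    """Asymmetric squared error (hypothetical helper, not from the paper).

    Weight is tau where target >= pred and (1 - tau) where target < pred.
    With tau = 0.5 this is half the ordinary mean squared error.
    """
    diff = np.asarray(target, dtype=float) - np.asarray(pred, dtype=float)
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return float(np.mean(weight * diff ** 2))


def sample_expectile(samples, tau, iters=100):
    """Constant that minimises expectile_loss over the samples.

    Uses the fixed-point form m = sum(w * x) / sum(w), where w depends on
    which side of m each sample falls.
    """
    x = np.asarray(samples, dtype=float)
    m = x.mean()
    for _ in range(iters):
        w = np.where(x > m, tau, 1.0 - tau)
        m = np.sum(w * x) / np.sum(w)
    return float(m)
```

With tau = 0.5 the expectile reduces to the sample mean, and values of tau above 0.5 pull the estimate towards the larger samples. How the paper couples such a loss with PPO is not stated in the abstract.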
