PD-FS: Surrogate-Enhanced Physical Data-Driven Framework for Rapid Deep Reinforcement Learning Control

RUIXIN ZHAN; Weiyuan Sun; Dongyue Huang; Shunxiang Cao

20 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0

Keywords: Deep reinforcement learning, Bionic propulsion, Surrogate model, Physical Partition Network, Computational Fluid Dynamics

Abstract: While deep reinforcement learning (DRL) has demonstrated broad potential in sequential decision-making, its application to fluid–dynamic systems remains limited by the prohibitive cost of high-fidelity simulations and the difficulty of capturing multi-scale unsteady behaviors. In this work, we focus specifically on the aquatic locomotion of fish-like robots, where the control objective is to track a specific target point while maintaining energy efficiency within a constrained time. The agent observes low-dimensional kinematic states and flow-related signals, and outputs oscillation frequency commands that drive body undulation. These sensing–action constraints define a task that requires both accurate flow responses and fast, iterative learning. Motivated by these domain-specific requirements, we propose a task-oriented Physical Data-Driven Flow Simulation (PD-FS) framework, a staged pipeline that couples lightweight neural surrogates with physics-guided refinement in full-order CFD. PD-FS incorporates mode-conditioned surrogate models with cycle-locked and memory-aware updates, enabling sample-efficient training while faithfully reproducing critical frequency-switching dynamics. Rather than claiming general applicability, we position PD-FS as an engineering integration tailored for fish swimming control under fluid–structure interaction. Policies refined in the CFD solvers adapt to nonlinear flow responses without relying on extensive domain randomization. In controlled fish-locomotion benchmarks, PD-FS achieves nearly 50 times faster training than CFD-only baselines, while reducing energy expenditure by over 20% at comparable success rates. These results highlight PD-FS as a domain-specific surrogate-to-CFD workflow for efficient and physically consistent control of fish-like robots.

Supplementary Material: pdf

Primary Area: learning on time series and dynamical systems

Submission Number: 24397
