PD-FS: Surrogate-Enhanced Physical Data-Driven Framework for Rapid Deep Reinforcement Learning Control
Keywords: Deep reinforcement learning, Bionic propulsion, Surrogate model, Physical Partition Network, Computational Fluid Dynamics
Abstract: While deep reinforcement learning (DRL) has demonstrated broad potential in sequential decision-making, its application to fluid-dynamic systems remains limited by the prohibitive cost of high-fidelity simulations and the difficulty of capturing multi-scale unsteady behaviors. In this work, we focus specifically on the aquatic locomotion of fish-like robots, where the control objective is to track a specified target point while maintaining energy efficiency within a constrained time. The agent observes low-dimensional kinematic states and flow-related signals, and outputs oscillation frequency commands that drive body undulation. These sensing–action constraints define a task that requires both accurate flow responses and fast, iterative learning. Motivated by these domain-specific requirements, we propose a task-oriented Physical Data-Driven Flow Simulation (PD-FS) framework: a staged pipeline that couples lightweight neural surrogates with physics-guided refinement in full-order CFD. PD-FS incorporates mode-conditioned surrogate models with cycle-locked and memory-aware updates, enabling sample-efficient training while faithfully reproducing critical frequency-switching dynamics. Rather than claiming general applicability, we position PD-FS as an engineering integration tailored for fish swimming control under fluid–structure interaction. Policies refined in the CFD solver adapt to nonlinear flow responses without relying on extensive domain randomization. In controlled fish-locomotion benchmarks, PD-FS trains nearly 50 times faster than CFD-only baselines while reducing energy expenditure by over 20% at comparable success rates. These results highlight PD-FS as a domain-specific surrogate-to-CFD workflow for efficient and physically consistent control of fish-like robots.
Supplementary Material: pdf
Primary Area: learning on time series and dynamical systems
Submission Number: 24397