Adviser-Actor-Critic: Reducing Steady-State Error in Reinforcement Learning for Robotics Control

ICLR 2026 Conference Submission 17048 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: reinforcement learning, robotics, control system
TL;DR: Adviser-Actor-Critic (AAC) combines reinforcement learning with a novel adviser to generate virtual goals, effectively reducing steady-state errors by over 80% in high-precision robotic control tasks.
Abstract: High-precision control tasks pose substantial challenges for reinforcement learning (RL) algorithms, which often deliver suboptimal performance due to network approximation inaccuracies and inadequate sample quality. While existing RL frameworks can complete tasks at coarse precision levels, steady-state tracking error remains a critical limitation that prevents reaching sub-hardware-level precision. We introduce Adviser-Actor-Critic (AAC), which addresses this precision-control dilemma by combining the precision of feedback control theory with the adaptive learning capability of RL. AAC features an Adviser that mentors the actor to refine control actions, thereby improving the precision of goal attainment. Across extensive benchmark environments from gymnasium-robotics and real-world quadcopter attitude control, AAC significantly outperforms standard RL algorithms in precision-critical tasks, achieving an average steady-state error reduction of $>80\%$ compared to baseline methods.
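
For intuition, the following is a minimal Python sketch of the "adviser generates virtual goals" idea described in the TL;DR and abstract: a PI-style adviser accumulates the goal-tracking error and shifts the goal seen by a frozen, pre-trained goal-conditioned actor, driving the residual steady-state error toward zero. The class name, gains (k_p, k_i), and the policy/environment interface are illustrative assumptions for this sketch, not the authors' actual implementation.

import numpy as np


class Adviser:
    """PI-style adviser that shifts the goal fed to a goal-conditioned actor."""

    def __init__(self, k_i: float = 0.05, k_p: float = 0.0):
        self.k_i = k_i                 # weight on the accumulated (integral) error
        self.k_p = k_p                 # optional weight on the instantaneous error
        self._error_integral = None

    def reset(self) -> None:
        self._error_integral = None

    def virtual_goal(self, achieved_goal: np.ndarray, desired_goal: np.ndarray) -> np.ndarray:
        """Return the adjusted (virtual) goal the actor should condition on this step."""
        error = desired_goal - achieved_goal
        if self._error_integral is None:
            self._error_integral = np.zeros_like(error)
        self._error_integral += error
        # Overshoot the true goal in proportion to the residual and accumulated error,
        # so the closed loop of (adviser + actor) settles on the true goal.
        return desired_goal + self.k_p * error + self.k_i * self._error_integral


# Illustrative rollout loop (env/policy interfaces assumed, Gymnasium-Robotics-style
# dict observations with "observation", "achieved_goal", "desired_goal"):
#
#   adviser = Adviser(k_i=0.05)
#   obs, _ = env.reset()
#   adviser.reset()
#   for _ in range(max_steps):
#       g = adviser.virtual_goal(obs["achieved_goal"], obs["desired_goal"])
#       action = policy(obs["observation"], g)   # frozen, pre-trained actor
#       obs, reward, terminated, truncated, info = env.step(action)
#       if terminated or truncated:
#           break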
Primary Area: reinforcement learning
Submission Number: 17048