Abstract: This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules. The primary advantage of the scheme over standard model-reference adaptive control techniques is that it does not require the learned inverse model to be invertible at all instances of time. This enables the use of general function approximators to approximate the linearizing controller for the system without having to worry about singularities. The overall learning system is stochastic, due to the random nature of the policy gradient updates, thus we combine analysis techniques commonly employed in the machine learning literature alongside stability arguments from adaptive control to demonstrate that with high probability the tracking and parameter errors concentrate near zero, under a standard persistency of excitation condition. A simulated example of a double pendulum demonstrates the utility of the proposed theory.
0 Replies
Loading