Abstract: We provide an algorithm for the simultaneous system identification and model predictive control of nonlinear
systems. The algorithm has finite-time near-optimality guarantees and asymptotically converges to the optimal (non-causal)
controller. In particular, the algorithm enjoys sublinear dynamic
regret, defined herein as the suboptimality against an optimal
clairvoyant controller that knows how the unknown disturbances
and system dynamics will adapt to its actions. The algorithm
is self-supervised and applies to control-affine systems with
unknown dynamics and disturbances that can be expressed in
reproducing kernel Hilbert spaces [1]. Such spaces can model
external disturbances and modeling errors that can even be
adaptive to the system’s state and control input. For example,
they can model wind and wave disturbances acting on aerial and
marine vehicles, or inaccurate model parameters such as the inertia
of mechanical systems. We are motivated by the future of
autonomy where robots will autonomously perform complex tasks
despite real-world unknown disturbances such as wind gusts.
The algorithm first generates random Fourier features that are
used to approximate the unknown dynamics or disturbances.
Then, it employs model predictive control based on the current
learned model of the unknown dynamics (or disturbances). The
model of the unknown dynamics is updated online using least
squares based on the data collected while controlling the system.
We validate our algorithm in both hardware experiments and
physics-based simulations. The simulations include (i) a cart-pole
aiming to maintain the pole upright despite inaccurate
model parameters, and (ii) a quadrotor aiming to track reference
trajectories despite unmodeled aerodynamic drag effects. The
hardware experiments include a quadrotor aiming to track a
circular trajectory despite unmodeled aerodynamic drag effects,
ground effects, and wind disturbances.
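
As an illustration of the learning step described above (random Fourier features followed by an online least-squares update), the following is a minimal sketch and not the paper's implementation. It assumes a Gaussian kernel with a hand-picked lengthscale, and it uses a recursive ridge-regression update as one possible way to do least squares online; the paper's exact update rule may differ. All names (`make_rff`, `OnlineLeastSquares`, `true_h`, `demo`) and the toy disturbance are hypothetical, and the model predictive control step is omitted.

```python
import numpy as np


def make_rff(dim_in, num_features, lengthscale, rng):
    """Return a random Fourier feature map approximating a Gaussian kernel.

    Assumption: Gaussian kernel with the given lengthscale (not from the paper).
    """
    # Frequencies sampled from the kernel's spectral density, phases uniform.
    W = rng.normal(0.0, 1.0 / lengthscale, size=(num_features, dim_in))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

    def phi(z):
        return np.sqrt(2.0 / num_features) * np.cos(W @ z + b)

    return phi


class OnlineLeastSquares:
    """Recursive ridge regression on fixed features (an assumed update rule)."""

    def __init__(self, num_features, dim_out, reg=1e-3):
        self.P = np.eye(num_features) / reg          # inverse regularized Gram matrix
        self.theta = np.zeros((num_features, dim_out))

    def update(self, feat, target):
        Pf = self.P @ feat
        gain = Pf / (1.0 + feat @ Pf)
        err = target - self.theta.T @ feat           # prediction error before update
        self.theta += np.outer(gain, err)
        self.P -= np.outer(gain, Pf)                 # Sherman-Morrison rank-one update

    def predict(self, feat):
        return self.theta.T @ feat


def demo(steps=2000, seed=0):
    """Fit a toy disturbance h(z) online and return the mean test error."""
    rng = np.random.default_rng(seed)
    phi = make_rff(dim_in=2, num_features=200, lengthscale=1.0, rng=rng)
    model = OnlineLeastSquares(num_features=200, dim_out=1)

    # Hypothetical unknown disturbance; z stacks a scalar state and a scalar input.
    def true_h(z):
        return np.array([np.sin(z[0]) + 0.5 * z[1]])

    for _ in range(steps):
        z = rng.uniform(-1.0, 1.0, size=2)           # stand-in for closed-loop data
        model.update(phi(z), true_h(z))

    test_points = rng.uniform(-1.0, 1.0, size=(200, 2))
    errors = [abs(model.predict(phi(z)) - true_h(z))[0] for z in test_points]
    return float(np.mean(errors))


if __name__ == "__main__":
    print("mean absolute test error:", demo())
```

In a closed-loop setting, the samples `z` would come from the states and inputs visited while controlling the system, and the current `predict` output would serve as the disturbance model inside the predictive controller at each step.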