GAN-MPC: Training Model Predictive Controllers with Parameterized Cost Functions using Demonstrations from Non-identical Experts

Returaj Burnwal; Anirban Santara; Nirav Pravinbhai Bhatt; Balaraman Ravindran; Gaurav Aggarwal

Published: 01 Jun 2023, Last Modified: 09 Jun 2023, DAI2023 Oral Presentation, Readers: Everyone

Abstract: Model predictive control (MPC) is a popular approach to trajectory optimization in practical robotics applications because of its guarantees on safety, optimality, generalizability, interpretability, and explainability. Traditional MPC needs a hand-crafted cost function for trajectory optimization. However, some behaviors are complex, and hand-crafting a cost function for them is difficult and error-prone. A special class of MPC policies called Learnable-MPC addresses this difficulty using imitation learning from expert demonstrations. However, these policies require the demonstrator and the imitator agents to have identical state-action spaces and transition dynamics, which is hard to satisfy in many practical robotics applications. In this paper, we address this problem through a novel approach that uses a generative adversarial network (GAN) to match the state-trajectory distributions of the demonstrator and the imitator. We evaluate our approach on a variety of simulated robotics tasks from the DeepMind Control Suite and demonstrate its efficacy at learning the demonstrator's behavior without having to copy the demonstrator's actions.
