Overcoming Knowledge Barriers: Online Imitation Learning from Observation with Pretrained World Models

Published: 01 Jul 2024, Last Modified: 24 Jul 2024
Venue: CVG Poster
License: CC BY 4.0
Keywords: World Models, Pretraining, Imitation Learning
TL;DR: We identify two knowledge barriers for pretrained-world-model-based Imitation Learning from Observation and propose AIME-NoB to overcome these barriers.
Abstract: Pretraining and finetuning models have become increasingly popular, but serious impediments remain for Imitation Learning from Observation (ILfO) with pretrained models. This study identifies two primary obstacles: the Embodiment Knowledge Barrier (EKB) and the Demonstration Knowledge Barrier (DKB). The EKB arises from pretrained models' limited ability to handle novel observations, which leads to inaccurate action inference. The DKB, in turn, stems from reliance on limited demonstration datasets, restricting the learned policy's adaptability across diverse scenarios. We propose separate solutions to overcome each barrier and apply them to Action Inference by Maximising Evidence (AIME), a state-of-the-art algorithm. The resulting algorithm, AIME-NoB, integrates online interactions and a data-driven regulariser to mitigate the EKB, and uses a surrogate reward function to broaden the policy's applicability, addressing the DKB. Our experiments on tasks from the DeepMind Control Suite and Meta-World benchmarks show that AIME-NoB significantly improves sample efficiency and performance, providing a robust framework for overcoming the challenges of ILfO with pretrained models.
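Note: the abstract does not specify the concrete form of the surrogate reward. As one hedged illustration, the sketch below shows a common choice in ILfO literature: a GAIL-style discriminator reward defined on observation transitions, so no expert actions are required. All names here (`Discriminator`, `surrogate_reward`, `discriminator_loss`) are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of a discriminator-based surrogate reward for ILfO.
# This is an illustrative, common construction, not AIME-NoB's exact method.
import torch
import torch.nn as nn


class Discriminator(nn.Module):
    """Classifies observation transitions as expert-like (1) vs. agent (0)."""

    def __init__(self, obs_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, obs: torch.Tensor, next_obs: torch.Tensor) -> torch.Tensor:
        # Score a transition (o_t, o_{t+1}); actions are never needed.
        return self.net(torch.cat([obs, next_obs], dim=-1))


def surrogate_reward(disc: Discriminator, obs, next_obs):
    """GAIL-style reward r = -log(1 - sigmoid(D(o, o'))).

    Higher when the transition looks expert-like; clamped for stability.
    """
    with torch.no_grad():
        logits = disc(obs, next_obs)
        return -torch.log(torch.clamp(1.0 - torch.sigmoid(logits), min=1e-6))


def discriminator_loss(disc, expert_obs, expert_next, agent_obs, agent_next):
    """Binary cross-entropy: expert transitions labeled 1, agent transitions 0."""
    bce = nn.BCEWithLogitsLoss()
    expert_logits = disc(expert_obs, expert_next)
    agent_logits = disc(agent_obs, agent_next)
    return (bce(expert_logits, torch.ones_like(expert_logits))
            + bce(agent_logits, torch.zeros_like(agent_logits)))
```

Defining the reward on observation pairs rather than state-action pairs keeps it action-free, which matches the ILfO setting where expert actions are unobserved; a dense reward of this kind is what lets a policy train beyond the coverage of the limited demonstration set.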
Submission Number: 2