IL-SOAR: Imitation Learning with soft optimistic actor critic

Published: 17 Jul 2025, Last Modified: 06 Sept 2025
Venue: EWRL 2025 Poster
License: CC BY 4.0
Keywords: imitation, practical exploration, inverse RL
Abstract: This paper introduces the SOAR framework for imitation learning. SOAR is an algorithmic template that learns a policy from expert demonstrations with a primal-dual style algorithm that alternates cost and policy updates. Within the policy updates, the SOAR framework uses an actor-critic method with multiple critics to estimate the critic uncertainty and to build an optimistic critic, which is fundamental to driving exploration. When instantiated in the tabular setting, we get a provable algorithm with guarantees that match the best known results in the desired accuracy parameter $\epsilon$. Practically, the SOAR template can boost the performance of \emph{any} imitation learning algorithm based on Soft Actor Critic (SAC). As an example, we show that SOAR consistently boosts the performance of the following SAC-based imitation learning algorithms: $f$-IRL, ML-IRL and CSIL. Overall, thanks to SOAR, the number of episodes required to achieve the same performance is reduced by half.
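The idea of using multiple critics to build an optimistic critic can be illustrated with a short sketch. The abstract does not say how the critics are combined, so everything here is an assumption made for illustration: the critics estimate cost-to-go (lower is better), the ensemble standard deviation serves as the uncertainty proxy, and the names `optimistic_critic`, `soft_policy`, `beta` and `temperature` are hypothetical and not taken from the paper.

```python
import numpy as np


def optimistic_critic(q_ensemble, beta=1.0):
    """Combine an ensemble of cost-to-go estimates into an optimistic one.

    q_ensemble: array of shape (n_critics, n_states, n_actions).
    Assumption: costs are minimized, so optimism means lowering the
    estimate where the critics disagree (mean minus beta * std).
    """
    q = np.asarray(q_ensemble, dtype=float)
    mean = q.mean(axis=0)
    std = q.std(axis=0)  # disagreement between critics as an uncertainty proxy
    return mean - beta * std


def soft_policy(q_cost, temperature=1.0):
    """Softmax policy that prefers actions with a low (optimistic) cost."""
    logits = -np.asarray(q_cost, dtype=float) / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)


# Two hypothetical critics, one state, two actions.
ensemble = np.array([[[1.0, 2.0]], [[1.0, 4.0]]])
q_opt = optimistic_critic(ensemble, beta=1.0)
policy = soft_policy(q_opt)
```

In this toy ensemble the critics agree on the first action and disagree on the second, so only the second action's cost estimate is lowered. That shifts some probability toward the less certain action, which is the exploration effect the abstract attributes to the optimistic critic.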
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Luca_Viano1
Track: Fast Track: published work
Publication Link: https://icml.cc/virtual/2025/poster/45502
Submission Number: 26