Conditional Kernel Imitation Learning for Continuous State Environments

Published: 04 Jun 2025 · Last Modified: 23 Sept 2025 · Proceedings of the 7th Annual Learning for Dynamics & Control Conference · CC BY 4.0
Abstract: Imitation Learning (IL) is an important paradigm within the broader reinforcement learning (RL) methodology. Unlike most of RL, it does not assume the availability of reward feedback. Classical methods such as behavioral cloning and inverse reinforcement learning are highly sensitive to estimation errors, especially in continuous state space problems. Meanwhile, state-of-the-art (SOTA) IL algorithms often require additional online interaction data to be effective. In this paper, we consider the problem of imitation learning in continuous state space environments based solely on observed behavior, without access to transition dynamics information, reward structure, or, most importantly, any additional interactions with the environment. Our approach is based on the Markov balance equation and introduces a novel imitation learning framework built on conditional kernel density estimation: it estimates the transition dynamics with conditional kernel density estimators and seeks to satisfy the balance equation for the environment. We establish that our estimators satisfy asymptotic consistency and present an associated sample complexity analysis. Through a series of numerical experiments on continuous state benchmark environments, we show consistently superior empirical performance over many SOTA IL algorithms. The full paper with the appendix is available at: https://github.com/rishabh-1086/CKIL.
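The abstract references a Markov balance equation and conditional kernel density estimators for the transition dynamics, but gives no formulas. As a point of reference only: one common form of such a balance condition equates the data's state density with the one implied by the policy and dynamics, e.g. $\hat{p}(s') = \int \hat{p}(s' \mid s, a)\,\hat{\pi}(a \mid s)\,\hat{p}(s)\,\mathrm{d}a\,\mathrm{d}s$; whether the paper uses exactly this form is not stated here. The Python sketch below shows a standard Nadaraya-Watson-style conditional kernel density estimator for a transition density $\hat{p}(s' \mid s, a)$ learned from demonstration tuples. The Gaussian kernels, fixed bandwidths, and function interface are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch: a Nadaraya-Watson-style conditional KDE for transition
# dynamics p(s' | s, a), estimated from demonstration data. Kernel choice,
# bandwidths, and interface are illustrative assumptions, not CKIL's exact
# estimator.
import numpy as np

def gaussian_kernel(u):
    """Multivariate Gaussian kernel evaluated row-wise on scaled differences."""
    return np.exp(-0.5 * np.sum(u ** 2, axis=1)) / (2 * np.pi) ** (u.shape[1] / 2)

def conditional_kde(s_query, a_query, s_next_query, S, A, S_next,
                    h_x=0.3, h_y=0.3):
    """Estimate p(s' | s, a) as a ratio of kernel density estimates:

        p_hat(s' | x) = sum_i K((x - x_i)/h_x) K((s' - s'_i)/h_y)
                        / (h_y^{d_y} * sum_i K((x - x_i)/h_x)),  x = (s, a).

    The h_x normalization cancels between numerator and denominator.
    """
    X = np.hstack([S, A])                          # conditioning variable (s, a)
    x_q = np.concatenate([s_query, a_query])
    w = gaussian_kernel((X - x_q) / h_x)           # kernel weights in (s, a) space
    k_y = gaussian_kernel((S_next - s_next_query) / h_y)
    d_y = S_next.shape[1]
    denom = np.sum(w)
    if denom == 0.0:                               # query far from all data
        return 0.0
    return np.sum(w * k_y) / (denom * h_y ** d_y)

# Toy usage with synthetic demonstration tuples (for illustration only).
rng = np.random.default_rng(0)
S = rng.normal(size=(500, 2))                      # observed states
A = rng.normal(size=(500, 1))                      # observed actions
S_next = S + 0.1 * A + 0.05 * rng.normal(size=(500, 2))  # observed next states
p_hat = conditional_kde(S[0], A[0], S_next[0], S, A, S_next)
print(f"estimated transition density at a demonstration point: {p_hat:.4f}")
```

Under standard smoothness conditions and bandwidths shrinking at appropriate rates, estimators of this ratio form are asymptotically consistent, which is the type of guarantee the abstract claims for the paper's estimators.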