Inverse GFlowNets for Generative Imitation Learning

20 Sept 2025 (modified: 02 Dec 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: generative models, imitation learning, gflownets, reinforcement learning
Abstract: Sequential generative models are typically trained by maximizing the evidence lower bound (ELBO), which maximizes a lower bound on the likelihood of the next observation given the current one. While ELBO-based training is simple and scalable, in sequential settings it suffers from compounding errors. In this work, we reinterpret ELBO training as an imitation learning problem for modeling data distributions. We show that prior formulations suffer from an entropy bias that is misaligned with the objectives of generative modeling. To address this issue, we leverage the GFlowNet framework to eliminate the bias and derive algorithms that can be viewed as regularized ELBO objectives. Our approach assigns positive rewards to data samples and negative rewards to policy-generated samples, corresponding to minimization of the $\chi^2$-divergence between the data distribution and the policy mixture. We further establish theoretical connections to existing imitation learning methods, providing transferable insights across domains. Empirically, our approach eliminates entropy bias and achieves improved performance on a range of generative modeling tasks when combined with previous methods.
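To make the abstract's objective concrete, here is a minimal toy sketch (not the paper's implementation; `chi2_divergence`, `p_data`, `p_policy`, and `reward` are hypothetical names we introduce) of the $\chi^2$-divergence between a data distribution and the half-half policy mixture, alongside the signed-reward assignment the abstract describes:

```python
import numpy as np

def chi2_divergence(p, q):
    """chi^2(p || q) = sum_x (p(x) - q(x))^2 / q(x) for discrete p, q."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum((p - q) ** 2 / q)

# Toy discrete distributions over three outcomes (assumed for illustration).
p_data   = np.array([0.7, 0.2, 0.1])   # data distribution
p_policy = np.array([0.3, 0.4, 0.3])   # current policy distribution
mixture  = 0.5 * (p_data + p_policy)   # "policy mixture" as we read it

# Zero exactly when p_policy matches p_data, so minimizing it drives
# the policy toward the data distribution.
print(chi2_divergence(p_data, mixture))

# Signed-reward assignment described in the abstract: positive reward for
# data samples, negative for policy-generated samples. The exact scaling
# and GFlowNet training objective are the paper's and are not shown here.
def reward(x, from_data: bool) -> float:
    return +1.0 if from_data else -1.0
```

Under these assumptions, driving the $\chi^2$-divergence to zero requires the policy to match the data distribution, which is consistent with the abstract's claim that the signed rewards remove the entropy bias of likelihood-style objectives.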
Supplementary Material: zip
Primary Area: generative models
Submission Number: 23960