Keywords: flow models, diffusion models, reward-guided fine-tuning, control-based fine-tuning
TL;DR: We unify reward-guided fine-tuning and flow merging via a probability-space optimization viewpoint, which enables reward-guided flow merging.
Abstract: Recent progress in large-scale flow and diffusion models has raised two fundamental algorithmic challenges: $(i)$ control-based reward adaptation of pre-trained flows, and $(ii)$ integration of multiple models, i.e., flow merging. While current approaches address these separately, we introduce a unifying probability-space framework that subsumes both as limit cases and enables *reward-guided flow merging*, allowing principled, task-aware combination of multiple pre-trained flows (e.g., merging priors while maximizing drug-discovery utilities). Our formulation makes it possible to express a rich family of *operators over generative model densities*, including intersection (e.g., to enforce safety), union (e.g., to compose diverse models), interpolation (e.g., for discovery), their reward-guided counterparts, as well as complex logical expressions via *generative circuits*. Next, we introduce Reward-Guided Flow Merging (RFM), a mirror-descent scheme that reduces reward-guided flow merging to a sequence of standard fine-tuning problems. We then provide first-of-their-kind theoretical guarantees for reward-guided and *pure* flow merging via RFM. Finally, we showcase the capabilities of the proposed method in illustrative settings that yield visually interpretable insights, and apply it to high-dimensional de novo molecular design.
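To make the mirror-descent reduction concrete, here is a minimal sketch of one plausible instantiation; the objective $F$, step sizes $\eta_k$, and reward $r$ are illustrative assumptions, not the paper's exact formulation. With the negative-entropy mirror map, each mirror-descent step on a probability-space objective $F(p)$ becomes a KL-regularized subproblem over densities, which has the same form as a standard reward-guided fine-tuning problem:

```latex
% Entropic mirror descent in probability space (illustrative sketch).
% F(p): a merging-plus-reward objective over densities; eta_k: step size.
% Each step is a KL-regularized problem of the standard fine-tuning form.
\[
  p_{k+1}
  \;=\;
  \operatorname*{arg\,min}_{p \in \mathcal{P}}
  \Big\{
    \big\langle \nabla F(p_k),\, p \big\rangle
    \;+\;
    \tfrac{1}{\eta_k}\,\mathrm{KL}\!\left(p \,\Vert\, p_k\right)
  \Big\},
  \qquad
  p_{k+1}(x) \;\propto\; p_k(x)\,\exp\!\big(-\eta_k\, \nabla F(p_k)(x)\big).
\]
% For a pure reward objective F(p) = -E_p[r], the update reads
% p_{k+1}(x) \propto p_k(x) exp(eta_k r(x)): an exponential tilt of the
% current flow by the reward, i.e., a standard fine-tuning target.
```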
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 12