Keywords: Flow matching
Abstract: Discrete diffusion and flow matching excel at capturing epistatic structure in protein fitness landscapes through parallel, iterative refinement. However, their implicit nature (sampling via learned dynamics without tractable densities) prevents direct use with principled variational frameworks like VSD and CbAS for budget-constrained design. We introduce \emph{Active Flow Matching (AFM)}, which reformulates variational objectives to operate on conditional endpoint distributions along the flow rather than requiring $\log q_\phi(x)$. This enables gradient-based steering of flow models toward high-fitness regions while preserving the rigor of VSD and CbAS. We derive forward-KL and reverse-KL variants using self-normalised importance sampling. Across four protein design tasks, forward-KL AFM consistently achieves lower regret and higher optimization performance than VSD and diffusion-based LaMBO-2, demonstrating effective exploration-exploitation under tight experimental budgets.
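For readers unfamiliar with self-normalised importance sampling, its generic textbook form is sketched below. The symbols $\tilde{p}$ (an unnormalised target), $q$ (a proposal that can be sampled and evaluated), $f$ (a test function), and $N$ (the sample count) are illustrative assumptions and are not taken from the paper, whose AFM objectives may weight and condition samples differently.

$$
\mathbb{E}_{p}[f(x)] \;\approx\; \sum_{i=1}^{N} \bar{w}_i \, f(x_i),
\qquad
\bar{w}_i = \frac{w_i}{\sum_{j=1}^{N} w_j},
\qquad
w_i = \frac{\tilde{p}(x_i)}{q(x_i)},
\qquad
x_i \sim q .
$$

Because the weights are normalised by their own sum, the unknown normalising constant of $\tilde{p}$ cancels; the estimator is biased for finite $N$ but consistent as $N \to \infty$.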
Submission Number: 6