Compositional Flow Matching with Factored Velocity Fields

Published: 26 May 2026, Last Modified: 26 May 2026
Venue: ICML 2026 FoGen Workshop Poster
License: CC BY 4.0
Keywords: deep generative modelling, flow matching, compositional generalization
Abstract: Conditional generative models often struggle to generate attribute combinations that are absent from training, even when each individual factor is densely covered; this is a failure of compositional generalization. We propose a factored conditional flow matching architecture in which a shared base velocity is augmented by per-factor heads whose outputs are summed at the bottleneck. On the Shapes3D and MPI3D-real datasets, the factored architecture matches or beats a parameter-matched monolithic baseline under three structured zero-shot holdout strengths over a two-attribute lattice, notably lowering held-out FID by $\sim 2.4\times$ on the $50\%$ and $75\%$ patterns on Shapes3D. Next, we conduct a slice-attack ablation using per-factor classifier-free composition and show that the factored architecture remains strictly better on both metrics, confirming that the gain is structural rather than a consequence of a weak baseline model. Finally, we show that the per-head construction also enables a $K \to K{+}1$ modular extension: a new factor head can be added to a frozen $K$-factor stack and trained alone, producing a working $(K{+}1)$-factor model without retraining the base model or any existing head.
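To make the factored construction concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each per-factor head is a simple embedding table whose row is added to the shared base features at the bottleneck, and it assumes a linear interpolation path for the flow matching target; the abstract specifies neither. The names `FactoredVelocity`, `add_head`, and `cfm_loss` are hypothetical and chosen only for illustration.

```python
import numpy as np


class FactoredVelocity:
    """Toy factored velocity field (illustrative assumption, not the paper's model).

    Shared base features and one contribution per factor head are summed at a
    hidden "bottleneck", then decoded linearly into a velocity.
    """

    def __init__(self, dim, hidden, factor_sizes, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden = hidden
        # Shared base: maps [x, t] to bottleneck features.
        self.W_base = rng.normal(scale=0.1, size=(dim + 1, hidden))
        # Shared decoder: maps bottleneck features to a velocity.
        self.W_out = rng.normal(scale=0.1, size=(hidden, dim))
        # One embedding table per factor (assumed form of a "head").
        self.heads = [rng.normal(scale=0.1, size=(n, hidden)) for n in factor_sizes]

    def add_head(self, n_values):
        # K -> K+1 extension: the new head starts at zero, so the extended
        # model initially reproduces the K-factor model exactly. In training,
        # only this table would be updated; everything else stays frozen.
        self.heads.append(np.zeros((n_values, self.hidden)))

    def __call__(self, x, t, attrs):
        # x: (batch, dim), t: scalar in [0, 1], attrs: (batch, >= K) integer labels.
        t_col = np.full((x.shape[0], 1), t)
        h = np.tanh(np.concatenate([x, t_col], axis=1) @ self.W_base)
        for k, table in enumerate(self.heads):
            h = h + table[attrs[:, k]]  # sum head contributions at the bottleneck
        return h @ self.W_out


def cfm_loss(model, x0, x1, t, attrs):
    # Conditional flow matching loss on a linear path (assumed):
    # x_t = (1 - t) * x0 + t * x1, with target velocity x1 - x0.
    xt = (1.0 - t) * x0 + t * x1
    target = x1 - x0
    return float(np.mean((model(xt, t, attrs) - target) ** 2))


# Usage: a two-factor model, then a third factor head added to the same stack.
rng = np.random.default_rng(1)
model = FactoredVelocity(dim=2, hidden=8, factor_sizes=[3, 4])
x0, x1 = rng.normal(size=(5, 2)), rng.normal(size=(5, 2))
attrs = np.array([[0, 1], [2, 3], [1, 0], [0, 0], [2, 2]])
loss = cfm_loss(model, x0, x1, 0.5, attrs)
model.add_head(n_values=2)
```

Zero-initialising the new head is one design choice among several: it guarantees that adding a factor cannot change the behaviour of the frozen $K$-factor stack before the new head is trained.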
Submission Number: 164