Abstract: Flow Matching (FM) models are a leading class of generative models, widely used across diverse domains. However, FM models require large-scale training datasets, which makes training computationally expensive. The existing feature-alignment approach (REPA) improves training efficiency but overlooks the role of the data itself, leaving further room for improvement. In this paper, we observe that different samples carry different amounts of Fisher information and thus contribute unequally to parameter learning in FM. This heterogeneity highlights the importance of accounting for sample-wise contributions during training. However, computing per-sample Fisher information exactly is prohibitively expensive in practice. To overcome this limitation, we provide a mathematical analysis showing that the loss magnitude can serve as an effective proxy for the trace of the Fisher Information Matrix (FIM), enabling efficient estimation. Building on this insight, we propose Fisher Policy Optimization (FPO), a strategy that dynamically reweights samples during training by shifting weight from low-FIM samples to high-FIM samples. Extensive experiments demonstrate that FPO improves both training efficiency and generation quality, while generalizing well across inference samplers, model architectures, and diffusion spaces.
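The core mechanism the abstract describes, using per-sample loss magnitude as a proxy for the FIM trace and shifting batch weight toward high-proxy samples, can be sketched as follows. This is a minimal illustration, not the paper's exact scheme: the softmax normalization and the temperature `tau` are assumptions introduced here for concreteness.

```python
import numpy as np

def fpo_weights(per_sample_losses, tau=1.0):
    """Map per-sample losses (proxies for the FIM trace, per the abstract's
    analysis) to normalized batch weights. Softmax with temperature `tau`
    is an illustrative choice: higher loss -> higher weight."""
    z = np.asarray(per_sample_losses, dtype=float) / tau
    z -= z.max()                      # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()                # weights sum to 1

def reweighted_loss(per_sample_losses, tau=1.0):
    """Batch objective with FPO-style reweighting in place of a uniform mean:
    weight is shifted from low-proxy (low-FIM) to high-proxy samples."""
    w = fpo_weights(per_sample_losses, tau)
    return float(np.dot(w, per_sample_losses))

losses = [0.2, 0.5, 1.5, 3.0]         # toy per-sample FM losses
print(fpo_weights(losses))            # larger loss receives larger weight
print(reweighted_loss(losses))        # exceeds the uniform mean of the batch
```

In an actual FM training loop, `per_sample_losses` would be the unreduced velocity-matching losses of the current minibatch, and the weighted sum would replace the usual mean reduction before backpropagation.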
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Liang-Chieh_Chen1
Submission Number: 8703