Diversified Flow Matching with Translation Identifiability

Sagar Shrestha; Xiao Fu

Published: 01 May 2025, Last Modified: 23 Jul 2025, ICML 2025 poster, CC BY 4.0

Abstract: Diversified distribution matching (DDM) finds a unified translation function mapping a diverse collection of conditional source distributions to their target counterparts. DDM was proposed to resolve content misalignment issues in unpaired domain translation, achieving translation identifiability. However, DDM has only been implemented using GANs due to its constraints on the translation function. GANs are often unstable to train and do not provide transport trajectory information, yet such trajectories are useful in applications such as single-cell evolution analysis and robot route planning. This work introduces *diversified flow matching* (DFM), an ODE-based framework for DDM. Adapting flow matching (FM) to enforce a unified translation function as in DDM is challenging, as FM learns the translation function's velocity rather than the translation function itself. A custom bilevel optimization-based training loss, a nonlinear interpolant, and a structural reformulation are proposed to address these challenges, offering a tangible implementation. To our knowledge, DFM is the first ODE-based approach guaranteeing translation identifiability. Experiments on synthetic and real-world datasets validate the proposed method.

Lay Summary: We address the problem of translating data from one format or domain to another, such as turning photos into cartoons or analyzing cellular evolution. Paired data (e.g., a photo and a cartoon of the same person) is hard to obtain, and existing methods based on the recent generative model called flow matching (FM) do not allow explicit control over what gets translated to what. For example, a photo of one person could be translated to a cartoon of a different person, resulting in content misalignment. The challenge is that it is not clear how to "control" FM-based generation. Our paper provides an in-depth analysis of the core issue and proposes a novel approach to control FM-based generation and produce content-aligned translation. Tests with synthetic and real-world examples show that it reliably resolves these issues, making it practical and effective for applications ranging from image editing to swarm robot navigation.

Primary Area: General Machine Learning->Unsupervised and Semi-supervised Learning

Keywords: unsupervised domain translation, flow matching, GAN, identifiability

Submission Number: 5394
