Abstract: While Flow Matching models have achieved state-of-the-art performance, their reliance on deterministic, straight-path ODE sampling limits their capacity to explore the multi-modal nature of data distributions under linguistic constraints. For example, a prompt for a "robot" may encompass distinct semantic modes (e.g., "red" vs. "yellow"), yet deterministic solvers often collapse into a single interpretation. This limitation is particularly restrictive in interactive scenarios where users want to "redraw" specific regions, exploring diverse local alternatives while following the same prompt and global context constraints.
To address this limitation, we propose Erasure-Redraw Sampling, a training-free framework that enables high-quality local semantic variations via a zigzag (backward-and-forward) sampling trajectory. Our method alternates between two phases: (1) Erasure, in which stochastic prompts are introduced during backward sampling to trigger mode-switching by effectively clearing existing local details; and (2) Redraw, in which visual prompts serve a dual purpose during the forward pass: guiding the synthesis of new semantic details and enforcing spatial coherence.
Experimental results demonstrate that our method effectively balances global consistency with local multi-modality, offering a robust, plug-and-play solution for diverse generation.
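The alternation between the two phases can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`toy_velocity`, `euler`, `erasure_redraw`), the time bounds `t_lo` and `t_hi`, the number of rounds, the use of a binary mask to localize the edit, and the modelling of prompts as plain arrays are all assumptions made for illustration. In particular, `toy_velocity` is a placeholder field chosen only so the example runs; a real system would call a trained, prompt-conditioned flow-matching network there.

```python
import numpy as np


def toy_velocity(x, t, cond):
    """Placeholder velocity field (NOT a trained model): pulls x toward cond."""
    return cond - x


def euler(x, t_from, t_to, cond, mask, velocity_fn, n_steps):
    """Euler-integrate dx/dt = v(x, t, cond) from t_from to t_to, inside mask only."""
    ts = np.linspace(t_from, t_to, n_steps + 1)
    for t0, t1 in zip(ts[:-1], ts[1:]):
        # (t1 - t0) is negative when integrating backward in time.
        x = x + (t1 - t0) * velocity_fn(x, t0, cond) * mask
    return x


def erasure_redraw(x, mask, visual_cond, rng, velocity_fn=toy_velocity,
                   t_lo=0.4, t_hi=1.0, n_rounds=2, n_steps=10):
    """Zigzag sampling sketch: alternate backward (erasure) and forward (redraw)."""
    for _ in range(n_rounds):
        # Erasure (assumed form): backward pass under a randomly drawn condition.
        random_cond = rng.standard_normal(x.shape)
        x = euler(x, t_hi, t_lo, random_cond, mask, velocity_fn, n_steps)
        # Redraw (assumed form): forward pass under the visual condition.
        x = euler(x, t_lo, t_hi, visual_cond, mask, velocity_fn, n_steps)
    return x


if __name__ == "__main__":
    image = np.zeros((8, 8))
    mask = np.zeros((8, 8))
    mask[2:6, 2:6] = 1.0  # region the user wants to redraw
    visual = np.ones((8, 8))
    out = erasure_redraw(image, mask, visual, np.random.default_rng(0))
    print("changed outside mask:", not np.array_equal(out[mask == 0], image[mask == 0]))
```

In this sketch, calling `erasure_redraw` with different random generators yields different content inside the mask, while the region outside the mask is never updated. How the actual method couples the redrawn region to its surroundings is not specified in the abstract.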
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: Erasure-Redraw, Diverse Flow Matching
Submission Number: 32781