Keywords: generative models, frequency transformation, image generation, ai for science, molecule assembly
Abstract: In this work, we propose K-Flow, a novel generative learning paradigm that flows along the $K$-amplitude domain, where $K$ is a scaling parameter that organizes projected coefficients (frequency bands), and amplitude refers to the norm of those coefficients. We instantiate K-Flow with three concrete $K$-amplitude transformations: the Fourier transform, the wavelet transform, and PCA. By incorporating these $K$-amplitude transformations, K-Flow enables flow matching in which the scaling parameter plays the role of time. We discuss six properties of K-Flow, covering its theoretical foundations, energy and temporal dynamics, and practical applications. In particular, from a practical standpoint, K-Flow allows for steerable generation by controlling the information at different scales. To demonstrate the effectiveness of K-Flow, we conduct experiments on both unconditional and conditional image generation tasks, showing that K-Flow achieves competitive performance. Furthermore, we perform three ablation studies to illustrate how K-Flow leverages the scaling parameter for controlled image generation. Additional results, including scientific applications, are also provided.
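To make the notion of a $K$-amplitude decomposition concrete, the following is a minimal sketch for the Fourier case on a 1D real signal. It is not the authors' implementation: the function names (`dft`, `idft`, `band_index`, `band_amplitudes`, `reconstruct_up_to`), the choice of a unitary DFT, and the grouping of coefficients $k$ and $n-k$ into one band are all assumptions made for illustration. It shows coefficients organized by a scaling index, the per-band amplitude taken as the norm of the coefficients in that band, and a partial reconstruction that keeps only bands up to a chosen $K$.

```python
import cmath
import math


def dft(x):
    """Unitary discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / math.sqrt(n)
        for k in range(n)
    ]


def idft(c):
    """Inverse of the unitary DFT above."""
    n = len(c)
    return [
        sum(c[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / math.sqrt(n)
        for t in range(n)
    ]


def band_index(k, n):
    """Assumed band assignment: coefficients k and n - k share one band."""
    return min(k, n - k)


def band_amplitudes(x):
    """Amplitude of each band: the Euclidean norm of the band's coefficients."""
    n = len(x)
    energy = [0.0] * (n // 2 + 1)
    for k, c in enumerate(dft(x)):
        energy[band_index(k, n)] += abs(c) ** 2
    return [math.sqrt(e) for e in energy]


def reconstruct_up_to(x, K):
    """Rebuild the signal from bands 0..K only, zeroing all finer bands."""
    n = len(x)
    kept = [c if band_index(k, n) <= K else 0j for k, c in enumerate(dft(x))]
    # For real input, symmetric bands keep the result real up to rounding.
    return [v.real for v in idft(kept)]


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.5, -0.3, 2.0, -1.0, 0.25, 0.75]
    print("band amplitudes:", band_amplitudes(signal))
    for K in range(len(signal) // 2 + 1):
        print(K, reconstruct_up_to(signal, K))
```

In this sketch, increasing `K` adds progressively finer bands, so `K = 0` returns the signal mean and the largest `K` returns the original signal. How K-Flow itself interpolates across bands during flow matching is described in the paper, not here.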
Supplementary Material: pdf
Primary Area: generative models
Submission Number: 24049