Shaping Robotic Actions with Fourier Flow Matching

ICLR 2026 Conference Submission 20523 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Vision-Language-Action Models, Generalist Policies, Robotic Manipulation, Robotics

TL;DR: Fourier-based flow-matching method for Vision-Language-Action (VLA) policies.

Abstract: We present a Fourier-based flow-matching method for Vision-Language-Action (VLA) policies that lets the policy reason over smooth trajectories rather than stepwise actions. Instead of training on raw joint- or Cartesian-space action sequences, we project each sequence onto a compact Discrete Cosine Transform (DCT) basis and learn directly in coefficient space via flow matching. This trajectory-level representation enforces smoothness and reduces dimensionality. Importantly, we show that the DCT representation integrates with asynchronous plan-execute schemes, preserving policy responsiveness. In experiments, predicting DCT coefficients yields higher task success than classical flow-matching VLA baselines trained on per-step actions. Our results indicate that Fourier-domain flow matching is a simple, drop-in alternative that improves the performance and stability of VLA policies.
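
The core idea in the abstract, projecting an action chunk onto a truncated DCT basis and applying flow matching to the coefficients, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the horizon, the number of retained coefficients, the normalization, or the probability path, so the orthonormal DCT-II basis, the linear noise-to-data path, and all function names below (`dct_basis`, `to_coeffs`, `from_coeffs`, `flow_matching_pair`) are assumptions chosen for illustration. The sketch handles a single action dimension; a real policy would apply it per dimension and train a network to regress the target velocity.

```python
import math
import random


def dct_basis(horizon, num_coeffs):
    """Orthonormal DCT-II basis; rows are the first `num_coeffs` cosine modes.

    Assumption: the paper's exact DCT variant and scaling are not given.
    """
    basis = []
    for k in range(num_coeffs):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / horizon)
        basis.append([scale * math.cos(math.pi * (n + 0.5) * k / horizon)
                      for n in range(horizon)])
    return basis


def to_coeffs(actions, basis):
    """Project a length-H action sequence onto the K basis rows."""
    return [sum(b * a for b, a in zip(row, actions)) for row in basis]


def from_coeffs(coeffs, basis):
    """Reconstruct a length-H sequence from K coefficients.

    With K < H the high-frequency modes are dropped, so the result is smooth.
    """
    horizon = len(basis[0])
    return [sum(c * row[n] for c, row in zip(coeffs, basis))
            for n in range(horizon)]


def flow_matching_pair(coeffs, rng):
    """Build one flow-matching training example in coefficient space.

    Assumed convention (linear path): x_t = (1 - t) * noise + t * coeffs,
    with target velocity coeffs - noise.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in coeffs]
    t = rng.random()
    x_t = [(1.0 - t) * e + t * c for e, c in zip(noise, coeffs)]
    velocity = [c - e for c, e in zip(coeffs, noise)]
    return t, x_t, velocity


# Usage: a 16-step chunk for one action dimension, compressed to 4 coefficients.
rng = random.Random(0)
chunk = [math.sin(0.2 * n) for n in range(16)]
basis = dct_basis(horizon=16, num_coeffs=4)
coeffs = to_coeffs(chunk, basis)
t, x_t, velocity = flow_matching_pair(coeffs, rng)
smooth_chunk = from_coeffs(coeffs, basis)
```

At inference time, a policy trained this way would integrate the learned velocity field from noise to a coefficient vector and then call something like `from_coeffs` to recover the executable action sequence.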

Primary Area: applications to robotics, autonomy, planning

Submission Number: 20523
