ActionFlow: Equivariant, Accurate, and Efficient Manipulation Policies with Flow Matching

Published: 29 Oct 2024, Last Modified: 03 Nov 2024, CoRL 2024 Workshop MRM-D Poster, CC BY 4.0
Keywords: Diffusion and energy-based policies for robot manipulation, Learning from Demonstrations, Behavioral Cloning, Deep Generative Models, Flow Matching, Robot Manipulation
TL;DR: A novel policy class combining Flow Matching with SE(3) Invariant Transformers for efficient, equivariant, and expressive robot learning from demonstrations.
Abstract: Spatial understanding is a critical aspect of most robotic tasks, particularly when generalization is important. Despite the impressive results of deep generative models in complex manipulation tasks, the absence of a representation that encodes intricate spatial relationships between observations and actions often limits spatial generalization, necessitating large numbers of demonstrations. To tackle this problem, we introduce a novel policy class, ActionFlow. ActionFlow integrates spatial symmetry inductive biases while generating expressive action sequences. On the representation level, ActionFlow introduces an SE(3) Invariant Transformer architecture, which enables informed spatial reasoning based on the relative SE(3) poses between observations and actions. For action generation, ActionFlow leverages Flow Matching, a state-of-the-art deep generative model known for generating high-quality samples with fast inference -- an essential property for feedback control. In combination, ActionFlow policies exhibit strong spatial and locality biases and SE(3)-equivariant action generation. Our experiments demonstrate the effectiveness of ActionFlow and its two main components on simulated and real-world robotic manipulation tasks and confirm that ActionFlow yields equivariant, accurate, and efficient policies. Project website: https://flowbasedpolicies.github.io/
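For readers unfamiliar with Flow Matching as an action generator, the sketch below illustrates the general recipe in PyTorch: train a velocity network to regress the straight-line flow from noise to a demonstrated action, then integrate the learned ODE with a few Euler steps at inference time. All names here (VelocityNet, flow_matching_loss, sample_actions, act_dim, obs_dim) are illustrative assumptions, and the simple MLP stands in for the paper's SE(3) Invariant Transformer; this is a minimal sketch, not the authors' implementation.

```python
# Minimal conditional Flow Matching sketch for action generation.
# Hypothetical names; the MLP is a stand-in for an SE(3) Invariant Transformer.
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Predicts the flow velocity v(x_t, t | obs)."""
    def __init__(self, act_dim, obs_dim, hidden=256):
        super().__init__()
        self.act_dim = act_dim
        self.net = nn.Sequential(
            nn.Linear(act_dim + obs_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, x_t, t, obs):
        return self.net(torch.cat([x_t, obs, t], dim=-1))

def flow_matching_loss(model, actions, obs):
    """Regress the velocity of the straight path from noise x0 to the demonstrated action x1."""
    x1 = actions
    x0 = torch.randn_like(x1)              # noise sample
    t = torch.rand(x1.shape[0], 1)         # time in [0, 1]
    x_t = (1 - t) * x0 + t * x1            # linear interpolant
    target_v = x1 - x0                     # constant target velocity along the path
    pred_v = model(x_t, t, obs)
    return ((pred_v - target_v) ** 2).mean()

@torch.no_grad()
def sample_actions(model, obs, steps=10):
    """Integrate the learned ODE from noise to an action with a few Euler steps,
    which is what makes inference fast enough for feedback control."""
    x = torch.randn(obs.shape[0], model.act_dim)
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((obs.shape[0], 1), i * dt)
        x = x + dt * model(x, t, obs)
    return x
```

A typical usage under these assumptions: instantiate `VelocityNet(act_dim, obs_dim)`, minimize `flow_matching_loss` over demonstration batches, and call `sample_actions` in the control loop; only a handful of integration steps are needed because the learned flow is close to a straight path.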
Submission Number: 21