Learning Factorized Diffusion Policies for Conditional Action Diffusion

Published: 18 Apr 2025, Last Modified: 16 May 2025
Venue: ICRA 2025 FMNS Poster
License: CC BY 4.0
Keywords: diffusion, robotics, bayes, residual, learning
TL;DR: We present a novel compositional method for training diffusion models by decoupling observation modalities such as proprioception, vision, and touch.
Abstract: Diffusion models have emerged as a promising choice for learning robot skills from demonstrations. However, they face three challenges: diffusion models are not sample-efficient, data is expensive to collect in robotics, and the space of tasks is combinatorially large. The established way of training diffusion policies leaves little room to address these challenges beyond scaling the model size and the paired observation-action data. In this work, we propose a novel method for training diffusion models, termed 'Composable Diffusion Guidance' (CoDiG), which compositionally learns diffusion policies for robot skills with respect to different observation modalities, such as proprioception, vision, and touch. CoDiG provides greater flexibility in handling these observation modalities, yielding sample-efficiency gains of over 20% on applicable tasks. CoDiG also opens new avenues for research on foundation models, as it relaxes the requirement of scaling all observation modalities together during data collection.
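The abstract describes composing diffusion policies across observation modalities. A common way such composition is realized in the compositional-diffusion literature is to combine per-modality conditional noise (score) predictions with an unconditional prediction, in the style of classifier-free guidance. The sketch below illustrates that combination rule; the function name, weighting scheme, and the idea that CoDiG uses exactly this rule are assumptions, since the abstract does not specify the formulation.

```python
import numpy as np

def compose_guidance(eps_uncond, eps_per_modality, weights):
    """Combine an unconditional noise prediction with per-modality
    conditional predictions (e.g. proprioception, vision, touch).

    This is a generic compositional classifier-free-guidance rule,
    NOT necessarily the exact CoDiG update:
        eps = eps_uncond + sum_i w_i * (eps_i - eps_uncond)
    """
    eps = np.array(eps_uncond, dtype=float)
    for eps_m, w in zip(eps_per_modality, weights):
        # Each modality contributes its guidance direction, scaled by w.
        eps = eps + w * (np.array(eps_m, dtype=float) - eps_uncond)
    return eps

# Toy usage with stand-in predictions for a 3-dim action noise vector.
eps_uncond = np.zeros(3)
eps_vision = np.ones(3)          # hypothetical vision-conditioned prediction
eps_touch = np.full(3, 2.0)      # hypothetical touch-conditioned prediction
combined = compose_guidance(eps_uncond, [eps_vision, eps_touch], [0.5, 0.5])
```

One appeal of this form is that each modality's denoiser can be trained (or re-trained) independently, which matches the abstract's claim that modalities need not be scaled together during data collection.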
Supplementary Material: pdf
Submission Number: 28