Mini Diffuser: Accelerating Diffusion Policy Optimization via Two-Level Minibatching

Published: 08 May 2026, Last Modified: 08 May 2026, Venue: ICRA 2026 Workshop RL4IL (Oral), Readers: Everyone, License: CC BY 4.0
Keywords: diffusion model, diffusion policy, manipulation, imitation learning, robot learning
TL;DR: Accelerating diffusion policy training for both IL and RL gradient steps via costless Level-2 (L2) batching, which bypasses redundant vision backbone passes.
Abstract: Diffusion policies have been established as a dominant paradigm for robotic imitation learning (IL). However, the high computational cost of training these models remains a bottleneck for scaling to multi-task regimes and for efficient online adaptation via reinforcement learning (RL). We present Mini-Diffuser, a method that reduces the time and memory required to train vision-language robotic diffusion policies by an order of magnitude. Our approach exploits a fundamental asymmetry in action diffusion: whereas image diffusion targets high-dimensional outputs, action generation targets a comparatively low-dimensional space in which only the visual condition is high-dimensional. By introducing two-level minibatching, Mini-Diffuser pairs multiple noised action samples with a single vision-language condition, so the condition is encoded once and reused across those samples. This decoupling allows significantly more efficient gradient steps during both supervised imitation learning and iterative RL optimization. In the PushT benchmark environment, a Diffusion Policy equipped with Level-2 batching reaches the same performance in 60% less training time. In RLBench simulations, Mini-Diffuser achieves 95% of the performance of state-of-the-art multi-task policies while using only 5% of the training time and 7% of the memory. By drastically lowering the resource requirements for diffusion-based gradients, Mini-Diffuser provides a practical path for the pretraining, continuous fine-tuning, and real-world adaptation of foundation models at the intersection of IL and RL.
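The two-level minibatching idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the encoder stand-in `encode_condition`, the helper `two_level_batch`, the cosine-style noise schedule, and all shapes are assumptions chosen only to show how one condition encoding per Level-1 sample might be shared across `k` noised action samples (Level 2).

```python
import numpy as np

def encode_condition(obs, w_enc):
    # Hypothetical stand-in for the expensive vision-language backbone.
    return np.tanh(obs @ w_enc)

def two_level_batch(obs, actions, w_enc, k, rng, num_steps=100):
    """Pair each encoded condition (Level 1) with k noised action samples (Level 2)."""
    b, act_dim = actions.shape
    # The backbone runs once per Level-1 sample: b encodings, not b * k.
    cond = encode_condition(obs, w_enc)                      # (b, d_cond)
    # Each Level-2 sample draws its own timestep and noise.
    t = rng.integers(0, num_steps, size=(b, k))              # (b, k)
    # Assumed cosine-style schedule; values lie strictly between 0 and 1.
    alpha_bar = np.cos(0.5 * np.pi * (t + 1) / (num_steps + 1)) ** 2
    noise = rng.standard_normal((b, k, act_dim))             # (b, k, act_dim)
    clean = actions[:, None, :]                              # (b, 1, act_dim)
    noised = (np.sqrt(alpha_bar)[..., None] * clean
              + np.sqrt(1.0 - alpha_bar)[..., None] * noise)
    # Broadcasting shares the encoding across the k samples without copying it.
    cond_shared = np.broadcast_to(cond[:, None, :], (b, k, cond.shape[1]))
    return cond_shared, noised, t, noise, alpha_bar

rng = np.random.default_rng(0)
obs = rng.standard_normal((4, 32))       # 4 Level-1 samples, toy 32-dim observations
actions = rng.standard_normal((4, 7))    # low-dimensional action targets
w_enc = rng.standard_normal((32, 16))
cond_shared, noised, t, noise, alpha_bar = two_level_batch(obs, actions, w_enc, k=8, rng=rng)
print(cond_shared.shape, noised.shape)
```

In this sketch the denoiser would see 4 × 8 = 32 training pairs per step while the backbone stand-in is evaluated only 4 times, which is the asymmetry the abstract points to: the low-dimensional action branch is cheap to replicate, whereas the high-dimensional condition is not.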
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 17