Block-wise Adaptive Caching for Accelerating Diffusion Policy

Kangye Ji; Yuan Meng; Hanyun Cui; Ye Li; Jianbo Zhou; Shengjia Hua; Lei Chen; Zhi Wang

Published: 26 Jan 2026, Last Modified: 16 Feb 2026 | ICLR 2026 Poster | CC BY 4.0

Keywords: Efficient AI, Diffusion Policy, Visuomotor Policy, Robotics, Action Generation, Model Caching.

TL;DR: We introduce Block-wise Adaptive Caching, an efficient, training-free caching plugin that accelerates Diffusion Policy inference by up to 3x.

Abstract: Diffusion Policy has demonstrated strong visuomotor modeling capabilities, but its high computational cost renders it impractical for real-time robotic control. Despite huge redundancy across repetitive denoising steps, existing diffusion acceleration techniques fail to generalize to Diffusion Policy due to fundamental architectural and data divergences. In this paper, we propose **B**lock-wise **A**daptive **C**aching (**BAC**), a method to accelerate Diffusion Policy by caching intermediate action features. BAC achieves lossless action generation acceleration by adaptively updating and reusing cached features at the block level, based on a key observation that feature similarities exhibit non-uniform temporal dynamics and distinct block-specific patterns. To operationalize this insight, we first design an Adaptive Caching Scheduler to identify optimal update timesteps by maximizing the global feature similarities between cached and skipped features. However, applying this scheduler for each block leads to significant error surges due to the inter-block propagation of caching errors, particularly within Feed-Forward Network (FFN) blocks. To mitigate this issue, we develop the Bubbling Union Algorithm, which truncates these errors by updating the upstream blocks with significant caching errors before downstream FFNs. As a training-free plugin, BAC is readily integrable with existing transformer-based Diffusion Policy and vision-language-action models. Extensive experiments on multiple robotic benchmarks demonstrate that BAC achieves up to 3x inference speedup for free. Project page: https://block-wise-adaptive-caching.github.io.
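To make the scheduling idea concrete, the following is a minimal sketch of how update timesteps for a single block could be chosen so that the total similarity between cached and skipped features is maximized. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the optimization is solved, so the dynamic program, the function name `select_update_steps`, the similarity matrix `sim`, and the update budget `num_updates` are all hypothetical. The sketch also assumes that the first denoising step always computes the feature and that each skipped step reuses the feature from the most recent update.

```python
from typing import List, Sequence


def select_update_steps(sim: Sequence[Sequence[float]], num_updates: int) -> List[int]:
    """Choose `num_updates` timesteps at which one block refreshes its cache.

    Assumptions (hypothetical, for illustration only):
      * sim[i][j] is the similarity between the block's feature at step i
        and its feature at step j, measured offline on calibration data.
      * Step 0 is always an update step.
      * A skipped step j reuses the feature cached at the latest update i <= j.
    Returns the update steps that maximize the summed similarity.
    """
    n = len(sim)
    if not 1 <= num_updates <= n:
        raise ValueError("num_updates must be between 1 and the number of steps")

    # seg[i][j]: total similarity when the feature from step i serves steps i..j-1.
    seg = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n + 1):
            seg[i][j] = seg[i][j - 1] + sim[i][j - 1]

    neg_inf = float("-inf")
    # best[k][j]: best score for covering steps 0..j-1 with exactly k updates.
    best = [[neg_inf] * (n + 1) for _ in range(num_updates + 1)]
    prev = [[-1] * (n + 1) for _ in range(num_updates + 1)]
    best[0][0] = 0.0

    for k in range(1, num_updates + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):  # i is the k-th update step
                if best[k - 1][i] == neg_inf:
                    continue
                score = best[k - 1][i] + seg[i][j]
                if score > best[k][j]:
                    best[k][j] = score
                    prev[k][j] = i

    # Walk the back-pointers from the last step to recover the schedule.
    steps: List[int] = []
    j = n
    for k in range(num_updates, 0, -1):
        i = prev[k][j]
        steps.append(i)
        j = i
    return steps[::-1]
```

In this toy formulation, a block whose features change abruptly at one timestep would receive an update exactly there, which mirrors the abstract's observation that similarity is non-uniform over time. Running the routine separately per block would yield block-specific schedules; the inter-block error propagation that the Bubbling Union Algorithm addresses is not modeled here.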

Primary Area: applications to robotics, autonomy, planning

Submission Number: 332
