Caravan: Asynchronous Test-Time Adaptation for Faster Inference

Published: 01 Mar 2026, Last Modified: 05 Apr 2026, TTU at ICLR 2026 (Main), CC BY 4.0
Abstract: Test-time adaptation (TTA) updates a model online during deployment to improve robustness to distribution shifts. This robustness comes at a cost: computing updates during inference adds latency, which makes TTA impractical for latency-sensitive systems. We present \textbf{Caravan}, an asynchronous TTA framework that decouples inference from update computation. Caravan maintains three concurrent streams on a \emph{single GPU}: one high-priority inference stream and two low-priority streams that compute updates. Because updates necessarily lag behind inference, Caravan revisits sample selection: it updates only the normalization-layer affine parameters and running statistics, and only after (i) entropy filtering, which retains reliable samples, and (ii) gradient-consistency filtering of per-sample entropy gradients w.r.t. the last normalization layer, which discards conflicting updates. On ImageNet-C with ResNet50-BN, Caravan improves latency by up to $6.8\times$ and accuracy by 1.99\% over synchronous TTA methods.
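The two-stage sample selection described in the abstract can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation. The function name `select_samples`, the entropy threshold (a fraction `entropy_frac` of the maximum entropy $\ln C$), and the consistency test (cosine similarity of each per-sample gradient with the mean gradient of the retained samples) are all assumptions made for illustration; the abstract does not specify them. The gradients are passed in as plain lists standing in for per-sample entropy gradients w.r.t. the last normalization layer.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one sample's logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy in nats; zero-probability terms contribute nothing.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def cosine(u, v):
    # Cosine similarity; defined as 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0.0 and nv > 0.0 else 0.0

def select_samples(logits_batch, grads_batch, entropy_frac=0.4, min_cos=0.0):
    """Return indices of samples that pass both filters (hypothetical rule)."""
    # Stage (i): entropy filtering keeps confident (low-entropy) predictions.
    # ASSUMPTION: threshold is entropy_frac * ln(number of classes).
    num_classes = len(logits_batch[0])
    threshold = entropy_frac * math.log(num_classes)
    reliable = [i for i, logits in enumerate(logits_batch)
                if entropy(softmax(logits)) < threshold]
    if not reliable:
        return []
    # Stage (ii): gradient-consistency filtering drops samples whose gradient
    # conflicts with the mean gradient of the reliable set.
    # ASSUMPTION: "conflict" means cosine similarity <= min_cos.
    dim = len(grads_batch[0])
    mean_grad = [sum(grads_batch[i][d] for i in reliable) / len(reliable)
                 for d in range(dim)]
    return [i for i in reliable if cosine(grads_batch[i], mean_grad) > min_cos]

# Toy batch: sample 1 is near-uniform (high entropy); sample 3 is confident
# but its gradient points against the others.
logits = [[8.0, 0.0, 0.0], [0.1, 0.0, 0.1], [0.0, 7.0, 0.0], [6.0, 0.0, 0.0]]
grads = [[1.0, 0.5], [1.0, 1.0], [0.9, 0.6], [-1.0, -0.4]]
kept = select_samples(logits, grads)
print(kept)  # [0, 2]
```

In this toy batch, entropy filtering removes sample 1 and gradient-consistency filtering removes sample 3, so only samples 0 and 2 would contribute to an update of the normalization-layer parameters.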
Submission Number: 25