Accelerating Structured Chain-of-Thought in Autonomous Vehicles

ICLR 2026 Conference Submission 14508 Authors

18 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: autonomous vehicle, large language model, chain of thought
TL;DR: We propose FastDriveCoT, a novel parallel decoding method that achieves a 3-4$\times$ speedup for Chain-of-Thought reasoning in autonomous vehicles.
Abstract: Chain-of-Thought (CoT) reasoning enhances the decision-making capabilities of vision-language-action models in autonomous driving, but its autoregressive nature introduces significant inference latency, making it impractical for real-time applications. To address this, we introduce FastDriveCoT, a novel parallel decoding method that accelerates template-structured CoT. Our approach decomposes the reasoning process into a dependency graph of distinct sub-tasks, such as identifying critical objects and summarizing traffic rules, some of which can be generated in parallel. By generating multiple independent reasoning steps concurrently within a single forward pass, we significantly reduce the number of sequential computations. Experiments demonstrate a 3-4$\times$ speedup in CoT generation and a substantial reduction in end-to-end latency across various model architectures, all while preserving the original downstream task improvements brought by incorporating CoT reasoning.
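The abstract describes decomposing template-structured CoT into a dependency graph of sub-tasks and decoding independent sub-tasks concurrently. The sketch below illustrates only the scheduling idea, not the authors' implementation: the sub-task names and dependency structure are hypothetical, and in a real model each "round" would correspond to batched parallel generation within a forward pass rather than the placeholder grouping shown here.

```python
def parallel_rounds(deps):
    """deps maps each sub-task to the sub-tasks it depends on.
    Returns a list of rounds; sub-tasks in the same round have no
    unmet dependencies on each other and could be decoded in parallel."""
    remaining = {t: set(d) for t, d in deps.items()}
    rounds = []
    while remaining:
        # All sub-tasks whose dependencies are already generated.
        ready = sorted(t for t, d in remaining.items() if not d)
        if not ready:
            raise ValueError("cyclic dependency among sub-tasks")
        rounds.append(ready)
        for t in ready:
            del remaining[t]
        for d in remaining.values():
            d.difference_update(ready)
    return rounds

# Illustrative template-structured CoT for driving (names assumed):
deps = {
    "critical_objects": [],
    "traffic_rules": [],
    "object_intentions": ["critical_objects"],
    "risk_assessment": ["critical_objects", "traffic_rules"],
    "plan": ["object_intentions", "risk_assessment"],
}
print(parallel_rounds(deps))
# Three sequential rounds instead of five sequential sub-tasks.
```

Under this toy graph, five sub-tasks collapse into three sequential decoding rounds, which is the source of the reported reduction in sequential computation.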
Primary Area: applications to robotics, autonomy, planning
Submission Number: 14508