Tri-Agent Driving: Learning to Coordinate Agents via Scenario Complexity Representation for Efficient Autonomous Driving

Xue Zhao; Qinying Gu; Xinbing Wang; Nanyang Ye

03 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0

Keywords: Scenario Complexity-Aware, Vision-Language Models, Autonomous Driving

Abstract: End-to-End (E2E) autonomous driving systems face a fundamental dilemma: fast traditional models offer low latency but struggle with complex and ambiguous scenarios, while Vision-Language Model (VLM)-based systems provide powerful contextual understanding at the cost of high computational overhead. Instead of pursuing a single faster or more powerful model, we present Tri-Agent Driving (TAD), a dynamic framework that learns a representation of scenario complexity directly from raw multi-view camera inputs and uses it to select the most appropriate agent on the fly. This learned representation serves as a routing signal that enables real-time activation of the optimal agent, balancing computational efficiency and reasoning depth on demand. TAD integrates three complementary agents: a Fast Agent optimized for low-complexity, routine scenarios; a Smart Agent for medium-complexity scenes; and a Deep Thinking Agent enhanced with Chain-of-Thought (CoT) reasoning for high-complexity corner cases. The core of TAD is the trainable Agent Coordination module, which proactively predicts scenario complexity and triggers agent switching without human intervention. On a challenging hybrid test set spanning diverse traffic conditions, TAD achieves state-of-the-art trajectory prediction while reducing average inference latency by 26% (4.2 s vs. 5.7 s) and GPU memory consumption by 30% (15.4 GB vs. 22 GB) compared to the strongest VLM-based model. This "fast when possible, deep when necessary" paradigm establishes a new standard for efficient, robust, and adaptive end-to-end autonomous driving.
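The routing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' Agent Coordination module, which is trainable and predicts complexity from multi-view camera inputs. Here the complexity score is assumed to be a scalar in [0, 1] that is already available, and the thresholds, the function names `select_agent` and `relative_reduction`, and the agent labels are all hypothetical. The second function only re-derives the reported percentage reductions from the figures quoted in the abstract.

```python
# Hypothetical thresholds on a complexity score assumed to lie in [0, 1].
# The abstract does not say how complexity is discretized; these values are
# placeholders chosen only for illustration.
LOW_THRESHOLD = 0.33
HIGH_THRESHOLD = 0.66


def select_agent(complexity: float,
                 low: float = LOW_THRESHOLD,
                 high: float = HIGH_THRESHOLD) -> str:
    """Map a scalar complexity score to one of three agent labels."""
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity is assumed to lie in [0, 1]")
    if complexity < low:
        return "fast"            # low-complexity, routine scenarios
    if complexity < high:
        return "smart"           # medium-complexity scenes
    return "deep_thinking"       # high-complexity corner cases (CoT reasoning)


def relative_reduction(baseline: float, value: float) -> float:
    """Fractional reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline


if __name__ == "__main__":
    for score in (0.1, 0.5, 0.9):
        print(score, select_agent(score))
    # Figures quoted in the abstract: 4.2 s vs. 5.7 s, 15.4 GB vs. 22 GB.
    print(round(relative_reduction(5.7, 4.2) * 100))
    print(round(relative_reduction(22.0, 15.4) * 100))
```

A fixed threshold rule is the simplest stand-in for a learned router. In the framework described above, the decision would instead come from the learned scenario complexity representation.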

Primary Area: applications to robotics, autonomy, planning

Submission Number: 1209
