City-Adaptive Testing of Autonomous Driving with Traffic Prediction and Scenario Fuzzing

17 Sept 2025 (modified: 26 Jan 2026) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: Autonomous Driving Systems, Scenario-Based Testing, Simulation and Robustness Evaluation, Traffic Flow Prediction, Behavior Modeling
TL;DR: We propose a city-adaptive ADS testing framework combining traffic prediction, POP motorcycle modeling, and scenario fuzzing, generating realistic urban scenarios that expose long-tail risks and enhance ADS robustness.
Abstract: Autonomous Driving Systems (ADS) often struggle in complex urban environments because generic testing fails to capture city-specific traffic patterns and behaviors. To address this, we propose a city-adaptive testing framework that systematically evaluates ADS robustness by integrating spatiotemporal traffic prediction and multi-agent behavioral modeling. Our approach first introduces a novel traffic prediction model, called T-DDSTGCN, which combines graph and hypergraph representations to forecast segment-level traffic speed and intersection turning probabilities. It achieves the best performance on both the METR-LA and PEMS-BAY datasets, demonstrating its ability to capture spatiotemporal dependencies in traffic prediction tasks. Based on the predicted urban traffic flow, we construct diverse simulation scenarios enriched by a behavioral modeling framework called Primary Other Participants (POP), which simulates realistic motorcycle behavior using Level-K game theory and Social Value Orientation. To enhance scenario diversity, we further apply structured perturbations across traffic density, weather, and agent interactions. Our methodology is validated across 180 real-world urban scenarios on three industrial-scale simulation platforms, yielding 662 critical collision cases after multiple rounds of testing. An initial manual screening of these 662 simulated accident scenarios found that 88.1% closely resemble real-world accident videos and reports. Furthermore, ablation studies highlight the critical role of human-like agent behavior in exposing ADS failures. Our findings suggest that incorporating traffic context and behavioral diversity into simulation testing is crucial for ensuring ADS safety and robustness in real-world deployments.
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 8928