Abstract: The integration of sophisticated Vision-Language Models (VLMs) into vehicular systems is revolutionizing vehicle interaction and safety, enabling tasks such as Visual Question Answering (VQA). However, a critical gap persists: there is no comprehensive benchmark for multimodal VQA models in vehicular scenarios. To address this, we propose IntelliCockpitBench, a benchmark that encompasses diverse automotive scenarios. It includes images from front, side, and rear cameras; various road types; weather conditions; and interior views, integrating data from both moving and stationary states. Notably, all images and queries in the benchmark are verified for authenticity, ensuring the data accurately reflects real-world conditions. A scoring methodology that combines human and model-generated assessments enhances reliability and consistency. Our contributions include a diverse, authentic dataset for automotive VQA and a robust evaluation metric that aligns human and machine assessments. All code and data can be found at \url{https://anonymous.4open.science/r/IntelliCockpitBench-2F2E/}.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Intelligent Cockpit, VLM Evaluation, Visual Question Answering
Contribution Types: Data resources
Languages Studied: English
Submission Number: 3463