VP2Net: Visual Perception-Inspired Network for Exploring the Causes of Drivers’ Attention Shift

Chunyu Zhao, Tao Deng, Pengcheng Du, Wenbo Liu, Yi Huang, Fei Yan

Published: 01 Jan 2025, Last Modified: 09 Nov 2025 · IEEE Transactions on Intelligent Transportation Systems · CC BY-SA 4.0
Abstract: With the rapid development of autonomous driving technology, the recognition and understanding of driving events has become increasingly important for improving road safety. Existing methods for recognizing driving events rely solely on the inherent features of driving scenes; they neither model driver attention in real time nor integrate it into the understanding of driving events. Research has shown that understanding driver attention benefits the subsequent analysis of driving events. We propose the attention-based driving event dataset (ADED), which includes rich driving scenes, eye movement data, reasons for attention shifts, and event time windows. It enables the use of prior information about driver attention to guide the recognition of driving events. Based on our dataset, we propose a visual dual-perception network, named VP2Net, to explore the reasons behind driver attention shifts. The goal of VP2Net is to use driver attention to guide the recognition of driving events. Inspired by the dual-process mechanism of human visual cognition, we build a bottom-up sequential information encoding branch to extract low-level spatio-temporal information from the driving scene. Additionally, we establish a top-down attention perceptual encoding branch that simulates the driver's high-level visual cognitive process. It not only captures the driver's spatial attention allocation ("where to focus") but also applies perceptual enhancement along the temporal dimension ("when to focus"), yielding the driver's enhanced spatial attention information. This enhanced attention information guides the fusion of the scene's spatio-temporal features and selectively highlights the core objects/areas in the current driving task/event. Finally, we compare the proposed model with other SOTA networks and visualize the outputs of its key components. Our code is available at https://github.com/zhao-chunyu/VP2Net.
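To make the dual-branch idea in the abstract concrete, the sketch below is a minimal, hypothetical PyTorch illustration of attention-guided fusion: a bottom-up branch encodes spatio-temporal scene features, a top-down branch predicts a spatial attention map, and the map gates the scene features before event classification. All layer choices, names, and sizes are placeholders assumed for illustration; the authors' actual implementation is in the linked repository.

```python
import torch
import torch.nn as nn


class DualBranchFusionSketch(nn.Module):
    """Illustrative dual-branch, attention-guided fusion (not the authors' code)."""

    def __init__(self, in_channels=3, feat_dim=64, num_events=4):
        super().__init__()
        # Bottom-up branch: low-level spatio-temporal encoding over a clip of frames.
        self.bottom_up = nn.Sequential(
            nn.Conv3d(in_channels, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d((1, 7, 7)),  # collapse time, keep a coarse spatial grid
        )
        # Top-down branch: predicts a spatial attention map ("where to focus").
        self.top_down = nn.Sequential(
            nn.Conv3d(in_channels, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d((1, 7, 7)),
            nn.Conv3d(feat_dim, 1, kernel_size=1),
            nn.Sigmoid(),  # attention weights in [0, 1]
        )
        self.classifier = nn.Linear(feat_dim, num_events)

    def forward(self, clip):
        # clip: (batch, channels, frames, height, width)
        scene = self.bottom_up(clip)           # (B, C, 1, 7, 7) scene features
        attn = self.top_down(clip)             # (B, 1, 1, 7, 7) attention map
        fused = scene * attn                   # attention-guided fusion
        pooled = fused.flatten(2).mean(dim=2)  # (B, C) global pooling
        return self.classifier(pooled)         # (B, num_events) event logits


# Usage: a batch of 2 clips, 8 frames each, 112x112 RGB.
logits = DualBranchFusionSketch()(torch.randn(2, 3, 8, 112, 112))
print(logits.shape)  # torch.Size([2, 4])
```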