Abstract: Multi-object detection in traffic scenarios plays a crucial role in protecting people and property and in keeping road traffic flowing smoothly. However, existing algorithms perform poorly in real scenarios because of three drawbacks: (1) a scarcity of traffic scene datasets; (2) a lack of tailoring to specific scenarios; and (3) high computational complexity, which hinders practical use. In this paper, we propose a solution that addresses these drawbacks. Specifically, we introduce a Full-Scene Traffic Dataset (FSTD) with spatio-temporal features that covers multiple views, multiple scenes, and multiple object types. We also propose BNF-YOLOv7, an improved YOLOv7 model with redesigned BiFusion, NWD, and SPPFCSPC modules, which is a lightweight and efficient approach to the intricacies of multi-object detection in traffic scenarios. BNF-YOLOv7 makes three improvements over YOLOv7. First, we modify the SPPCSPC structure to obtain SPPFCSPC, which maintains the same receptive field while running faster. Second, we use the BiFusion feature fusion module to strengthen feature representation and improve the positional information of objects. Third, we introduce NWD and redesign the loss function to improve the detection of tiny objects in traffic scenarios. Experiments on the FSTD and UA-DETRAC datasets show that BNF-YOLOv7 outperforms other algorithms, with a 3.3% increase in mAP on FSTD and a 2.4% increase on UA-DETRAC. BNF-YOLOv7 also maintains significantly better real-time performance, increasing FPS by 10% in real scenarios.
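To illustrate why an NWD-style similarity can help with tiny objects, the following minimal sketch compares it with IoU on two small, non-overlapping boxes. It assumes that NWD refers to the normalized Wasserstein distance commonly used in tiny-object detection, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian. The function names `iou` and `nwd` and the normalization constant `c=12.8` are illustrative assumptions, and this is not the redesigned loss function of BNF-YOLOv7, which the abstract does not specify.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ax2 = a[0] - a[2] / 2, a[0] + a[2] / 2
    ay1, ay2 = a[1] - a[3] / 2, a[1] + a[3] / 2
    bx1, bx2 = b[0] - b[2] / 2, b[0] + b[2] / 2
    by1, by2 = b[1] - b[3] / 2, b[1] + b[3] / 2
    # Width and height of the intersection, clamped at zero.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance between boxes (cx, cy, w, h).

    Each box is treated as a Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). The constant c is dataset dependent; the
    default here is only a placeholder (assumption).
    """
    w2_squared = (
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


# A 4x4 ground-truth box and two predictions shifted by 5 and 6 pixels.
gt = (10.0, 10.0, 4.0, 4.0)
near = (15.0, 10.0, 4.0, 4.0)
far = (16.0, 10.0, 4.0, 4.0)

for name, pred in (("near", near), ("far", far)):
    print(name, "IoU =", iou(gt, pred), "NWD =", round(nwd(gt, pred), 4))
```

In this example both predictions have zero IoU with the ground truth, so an IoU-based term cannot tell them apart, whereas the NWD value still decreases smoothly as the prediction moves away. This is the usual motivation for NWD-style terms when objects span only a few pixels.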