LM-YOLO: Lightweight Multi Perception Network for Traffic Driving Scene Understanding

Haoran Liu, Wenbo Liu, Chunyu Zhao, Shengqi Chen, Fei Yan, Tao Deng

Published: 01 Jan 2025, Last Modified: 15 Nov 2025. License: CC BY-SA 4.0
Abstract: The rapid progress of autonomous driving has raised the demands on perception systems, which must be lightweight and efficient yet highly accurate given the limited computational resources of embedded platforms. In this paper, we present a multi-task network built on an encoder-decoder architecture. It uses a shared backbone, three distinct neck components, and specialized task heads to handle one detection task (traffic object detection) and two segmentation tasks (drivable area segmentation and lane detection). To improve the model's adaptability to complex traffic environments, we design a geometric deformation adaptation strategy, and we develop feature fusion strategies tailored to the specific requirements of each task. We present two versions of the model, n (nano) and s (small); the nano version contains only 3.46M parameters, significantly fewer than existing models such as YOLOP. The proposed model is evaluated on the challenging BDD100k dataset, where it achieves 82.3% mAP50 in traffic object detection, 91.4% mIoU in drivable area segmentation, and 29.1% IoU in lane detection. Compared with existing models, our approach strikes a better balance among accuracy, parameter count, and inference speed, making it more competitive for real-world applications.
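The abstract describes a shared-backbone layout with per-task necks and heads. The following is a minimal structural sketch of that decomposition, not the authors' implementation: all class, function, and task names here are illustrative assumptions, and toy callables stand in for the real convolutional modules.

```python
# Structural sketch (assumed names) of a multi-task model with one shared
# backbone, three task-specific necks, and three task heads, mirroring the
# one-detection-plus-two-segmentation split described in the abstract.
from dataclasses import dataclass
from typing import Callable, Dict, List

Feats = List[float]

@dataclass
class MultiTaskModel:
    backbone: Callable[[Feats], Feats]
    necks: Dict[str, Callable[[Feats], Feats]]
    heads: Dict[str, Callable[[Feats], Feats]]

    def forward(self, image: Feats) -> Dict[str, Feats]:
        # Shared features are computed once; each task branch
        # (neck followed by head) then consumes them independently.
        feats = self.backbone(image)
        return {task: self.heads[task](self.necks[task](feats))
                for task in self.heads}

# Toy stand-ins for the real modules; the point is the data flow.
identity = lambda x: x
model = MultiTaskModel(
    backbone=lambda img: [v * 0.5 for v in img],  # shared encoder
    necks={"detect": identity, "drivable": identity, "lane": identity},
    heads={"detect": identity, "drivable": identity, "lane": identity},
)
out = model.forward([1.0, 2.0])  # one dict entry per task branch
```

The key design property this sketch captures is that the backbone runs once per image, so the marginal cost of each extra task is only its neck and head, which is what keeps such multi-task models small relative to running three separate networks.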