How Lightweight Deep Learning Enhances Performance in DPU-Accelerated Autonomous Driving on Zynq SoC

Gyu Hyeon Hwang, HoBin Oh, Jae Wook Jeon

Published: 01 Jan 2025, Last Modified: 04 Aug 2025, Venue: ICMRE 2025, License: CC BY-SA 4.0

Abstract: This study presents a lightweight deep learning model developed for DPU-accelerated systems, aimed at real-time autonomous driving on resource-constrained platforms such as the Ultra96v2. A customized children's electric car served as the test platform, fitted with custom power supply and steering control systems to enable real-world testing. To improve inference performance, input size reduction, channel pruning, and quantization were applied. As a result, the pruned and quantized YOLOv3-Tiny model reached a frame rate of 67.592 FPS, roughly a 25x increase over the 2.715 FPS of the original YOLOv3 on the Ultra96v2's PL domain. These results show that real-time deployment is feasible on FPGA-based platforms. The work offers insights for building efficient and scalable embedded systems for self-driving vehicles.
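
The "roughly 25x" figure follows directly from the two frame rates quoted in the abstract. The short Python sketch below reproduces that ratio and also converts each frame rate into an approximate per-frame time. The function names are illustrative, and the per-frame time is a derived quantity that assumes frames are processed one at a time at the reported throughput; the abstract itself reports only FPS.

```python
# Frame rates quoted in the abstract (measured on the Ultra96v2).
BASELINE_FPS = 2.715    # original YOLOv3
OPTIMIZED_FPS = 67.592  # pruned and quantized YOLOv3-Tiny


def speedup(optimized_fps: float, baseline_fps: float) -> float:
    """Ratio of optimized throughput to baseline throughput."""
    return optimized_fps / baseline_fps


def frame_time_ms(fps: float) -> float:
    """Approximate time per frame in milliseconds.

    Assumption: frames are processed sequentially, so the time per
    frame is simply the reciprocal of the throughput.
    """
    return 1000.0 / fps


if __name__ == "__main__":
    ratio = speedup(OPTIMIZED_FPS, BASELINE_FPS)
    print(f"speedup: {ratio:.2f}x")
    print(f"baseline frame time:  {frame_time_ms(BASELINE_FPS):.1f} ms")
    print(f"optimized frame time: {frame_time_ms(OPTIMIZED_FPS):.1f} ms")
```

Under that sequential assumption, the optimized model spends under 15 ms per frame, compared with several hundred milliseconds for the baseline.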

External IDs: dblp:conf/icmre/HwangOJ25