Abstract: Panoptic perception systems are critical for autonomous driving because they handle multiple visual tasks simultaneously. By integrating several tasks into a single network, they achieve lower overall inference latency than designing an independent network for each task. Existing panoptic perception networks often rely on pre-trained classification models as their backbone; these are not tailored to the target tasks, which compromises accuracy. To address this, we propose a dual-branch backbone and a wide perception segmentation head that improve the network's effectiveness for autonomous driving. The enhanced network simultaneously performs vehicle object detection, drivable area segmentation, and lane segmentation. Furthermore, to meet the stringent latency requirements of autonomous driving, we deploy the network on an FPGA accelerator card. In experiments on the challenging BDD100K dataset, the model significantly surpasses the baseline in accuracy on all tasks. To satisfy real-time demands, the network is deployed on a VCK5000 FPGA, achieving inference approximately 35.2 times faster and energy efficiency approximately 41.5 times higher than a GPU-based deployment, which offers significant advantages in resource-constrained scenarios.
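The abstract describes a dual-branch backbone shared by three task heads. The sketch below illustrates that overall structure in PyTorch; it is not the authors' code, and every module name, channel width, and the dilated-convolution "wide perception" head design are illustrative assumptions rather than details from the paper.

```python
# Illustrative sketch only: a dual-branch backbone feeding three task heads
# (detection, drivable-area segmentation, lane segmentation). All layer
# choices and widths below are assumptions, not the paper's architecture.
import torch
import torch.nn as nn


def conv_bn_relu(in_ch, out_ch, k=3, s=1):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, s, k // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class DualBranchBackbone(nn.Module):
    """One branch keeps high resolution for segmentation detail; the other
    downsamples aggressively for detection-level semantics (assumed split)."""
    def __init__(self):
        super().__init__()
        self.stem = conv_bn_relu(3, 32, s=2)
        self.detail = nn.Sequential(      # high-resolution branch (stride 4)
            conv_bn_relu(32, 64, s=2),
            conv_bn_relu(64, 64),
        )
        self.semantic = nn.Sequential(    # low-resolution branch (stride 16)
            conv_bn_relu(32, 64, s=2),
            conv_bn_relu(64, 128, s=2),
            conv_bn_relu(128, 256, s=2),
        )

    def forward(self, x):
        x = self.stem(x)
        return self.detail(x), self.semantic(x)


class WidePerceptionSegHead(nn.Module):
    """Assumed reading of 'wide perception': enlarge the receptive field with
    parallel dilated convolutions before predicting the segmentation mask."""
    def __init__(self, in_ch, n_classes):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, 64, 3, padding=d, dilation=d) for d in (1, 2, 4)
        )
        self.fuse = conv_bn_relu(64 * 3, 64)
        self.cls = nn.Conv2d(64, n_classes, 1)

    def forward(self, x, out_size):
        x = torch.cat([b(x) for b in self.branches], dim=1)
        x = self.cls(self.fuse(x))
        return nn.functional.interpolate(
            x, out_size, mode="bilinear", align_corners=False
        )


class PanopticPerceptionNet(nn.Module):
    def __init__(self, n_det_outputs=6):  # placeholder: boxes + objectness
        super().__init__()
        self.backbone = DualBranchBackbone()
        self.det_head = nn.Conv2d(256, n_det_outputs, 1)   # stand-in detector
        self.drivable_head = WidePerceptionSegHead(64, 2)  # drivable area
        self.lane_head = WidePerceptionSegHead(64, 2)      # lane lines

    def forward(self, x):
        detail, semantic = self.backbone(x)
        size = x.shape[-2:]
        return {
            "detection": self.det_head(semantic),
            "drivable": self.drivable_head(detail, size),
            "lane": self.lane_head(detail, size),
        }


if __name__ == "__main__":
    net = PanopticPerceptionNet()
    out = net(torch.randn(1, 3, 384, 640))
    print({k: tuple(v.shape) for k, v in out.items()})
```

Sharing one backbone across the three heads is what yields the latency advantage the abstract claims over running three independent networks: the backbone's cost is paid once per frame, and only the lightweight heads run per task.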