Abstract: Understanding the road structure is essential for achieving autonomous driving. This intricate topic contains two fundamental components: the interconnections between lanes and the associations between lanes and traffic elements (e.g., traffic lights), for which a comprehensive topology reasoning method is still absent. On one hand, existing map learning techniques struggle to derive lane connectivity from segmentation or laneline-based representations, while prior approaches that focus on centerline detection neglect interaction modeling. On the other hand, the task of assigning traffic elements to lanes has been limited to the image domain, leaving the construction of correspondence between image and 3D views as an unexplored challenge. To address these issues, we present TopoNet, an end-to-end topology reasoning network for analyzing driving scenes. To capture the topology of driving environments effectively, we introduce three key designs: (1) an embedding module that integrates semantic knowledge from 2D elements into a unified feature space; (2) a curated scene graph neural network that models relationships and facilitates feature interactions within the network; (3) instead of transmitting messages arbitrarily, a scene knowledge graph is devised to differentiate prior knowledge from various types of scene topology. We evaluate TopoNet on the challenging scene understanding benchmark, OpenLane-V2, where our approach outperforms all previous works by a large margin across all perceptual and topological metrics. The code will be publicly released.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Matthew_Walter1
Submission Number: 2998