Abstract: Perception and awareness of the surroundings are essential for autonomous vehicle navigation. To drive safely, autonomous systems must extract spatial information and understand the semantic meaning of the environment. We propose a novel network architecture, BEVSeg, that generates perception and semantic information by incorporating geometry-based and data-driven techniques into two respective modules. Specifically, the geometry-based aligned BEV-domain data augmentation module addresses overfitting and misalignment issues by augmenting the coherent BEV feature map, aligning the augmented object and segmentation ground truths, and aligning the augmented BEV feature map with its augmented ground truths. The data-driven hierarchical double-branch spatial attention module addresses the inflexibility of BEV feature generation by flexibly learning multi-scale BEV features via an enlarged feature receptive field and learned regions of interest. Experimental results on the nuScenes benchmark dataset demonstrate that BEVSeg achieves state-of-the-art results, with an mIoU 3.6% higher than the baseline. Code and models will be released.
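The aligned BEV-domain augmentation described above hinges on applying the same geometric transform to the BEV feature map and to its ground truths so the two stay registered. A minimal sketch of that idea (hypothetical helper names; NumPy arrays standing in for BEV tensors; not the authors' released code):

```python
import numpy as np

def augment_bev_aligned(bev_feat, seg_gt, rng):
    """Apply one random flip / 90-degree rotation jointly to a BEV feature
    map of shape (C, H, W) and its segmentation ground truth of shape (H, W),
    keeping the pair aligned. Illustrative sketch, not the paper's code."""
    k = int(rng.integers(0, 4))        # number of 90-degree rotations
    flip = bool(rng.integers(0, 2))    # whether to also flip left-right
    feat = np.rot90(bev_feat, k, axes=(1, 2))  # rotate spatial dims of features
    gt = np.rot90(seg_gt, k, axes=(0, 1))      # same rotation on the labels
    if flip:
        feat = feat[:, :, ::-1]
        gt = gt[:, ::-1]
    return feat.copy(), gt.copy()
```

Because feature map and ground truth undergo identical transforms, a per-pixel correspondence established before augmentation still holds afterward, which is the alignment property the module relies on.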
External IDs: dblp:conf/icvs/ChenTLWQ23