Abstract: In this paper we present Semantic Stixels, a novel vision-based scene model geared towards automated driving. Our model jointly infers the geometric and semantic layout of a scene and provides a compact yet rich abstraction of both cues using Stixels as primitive elements. Geometric information is incorporated into our model in terms of pixel-level disparity maps derived from stereo vision. For semantics, we leverage a modern deep learning-based scene labeling approach that provides an object class label for each pixel. Our experiments involve an in-depth analysis and a comprehensive assessment of the constituent parts of our approach using three public benchmark datasets. We evaluate the geometric and semantic accuracy of our model and analyze the underlying run-times and the complexity of the obtained representation. Our results indicate that the joint treatment of both cues on the Semantic Stixel level yields a highly compact environment representation while maintaining an accuracy comparable to the two individual pixel-level input data sources. Moreover, our framework compares favorably to related approaches in terms of computational costs and operates in real-time.
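To make the notion of a compact abstraction concrete, the following minimal sketch shows how one image column of per-pixel disparities and class labels could be collapsed into a few Stixel-like segments. It is an illustration under stated assumptions, not the inference procedure of the paper: the `Stixel` fields, the function `column_to_stixels`, and the threshold `max_disp_jump` are hypothetical names, and the greedy split rule stands in for the joint inference described above.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Stixel:
    """One vertical segment of an image column (hypothetical layout)."""
    column: int        # image column index
    top: int           # first row covered (inclusive)
    bottom: int        # last row covered (inclusive)
    disparity: float   # mean disparity of the covered pixels
    label: int         # semantic class id shared by the covered pixels


def column_to_stixels(column: int,
                      disparities: Sequence[float],
                      labels: Sequence[int],
                      max_disp_jump: float = 1.0) -> List[Stixel]:
    """Greedy toy segmentation of a single column.

    A new segment starts whenever the class label changes or the disparity
    differs from the previous pixel by more than max_disp_jump (assumed value).
    """
    if len(disparities) != len(labels):
        raise ValueError("disparities and labels must have the same length")

    stixels: List[Stixel] = []
    start = 0
    for row in range(1, len(labels) + 1):
        at_end = row == len(labels)
        # at_end is checked first so the index accesses below stay in range.
        if (at_end
                or labels[row] != labels[row - 1]
                or abs(disparities[row] - disparities[row - 1]) > max_disp_jump):
            segment = disparities[start:row]
            stixels.append(Stixel(column=column,
                                  top=start,
                                  bottom=row - 1,
                                  disparity=sum(segment) / len(segment),
                                  label=labels[start]))
            start = row
    return stixels


if __name__ == "__main__":
    # Five pixels of one column: three on a near object, two on a far one.
    for s in column_to_stixels(0, [10.0, 10.0, 10.0, 4.0, 4.0], [1, 1, 1, 2, 2]):
        print(s)
```

In this toy setting, five pixels reduce to two segments, each carrying one depth value and one class label. This is the sense in which a Stixel representation is far more compact than the two pixel-level inputs while retaining both cues.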