Abstract: Road-scene parsing is complex and changeable; the interferences in the background destroy the visual structure in the image data, increasing the difficulty of target detection. The key to addressing road-scene parsing is to amplify the feature differences between the targets, as well as those between the targets and the background. This paper proposes a novel scene-parsing network, Attentional Prototype-Matching Network (APMNet), to segment targets by matching candidate features with target prototypes regressed from labeled road-scene data. To obtain reliable target prototypes, we designed the Sample-Selection and the Class-Repellence Algorithm in the prototype-regression progress. Also, we built the class-to-class and target-to-background attention mechanisms to increase feature recognizability based on the target’s visual characteristics and spatial-target distribution. Experiments conducted on two road-scene datasets, CamVid and Cityscapes, demonstrate that our approach effectively improves the representation of targets and achieves impressive results compared with other approaches.
0 Replies
Loading