Abstract: Visual semantic context describes the relationship between objects and their environment in images. Analyzing this context yields important cues for more holistic scene understanding. While visual semantic context is often learned implicitly, this work proposes a simple algorithm to obtain explicit priors and uses them in two ways. First, irrelevant images are filtered out during data aggregation, a key step in improving domain coverage, especially for public datasets. Second, context is used to predict the domains of objects of interest. The framework is applied to the context around airplanes in ADE20K-SceneParsing, COCO-Stuff and PASCAL-Context. As intermediate results, context statistics were obtained to guide design and mapping choices for the merged dataset SemanticAircraft, and image patches were manually annotated in a one-hot manner across four aerial domains. Three different methods predict the domains of airplanes: an original threshold algorithm and unsupervised clustering models use the context priors, while a supervised CNN works on input images with domain labels. All three models achieved acceptable prediction results, with the CNN obtaining accuracies of 69% to 85%. Additionally, the context statistics and the applied clustering models provide data introspection, enabling a deeper understanding of the visual content.
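To illustrate the idea of explicit context priors and threshold-based domain prediction, the following is a minimal sketch and not the algorithm from this work. It assumes that a context prior is the per-class share of non-airplane pixels in a segmentation label map, and that a domain is predicted when its summed share reaches a threshold. The names `CLASS_TO_DOMAIN`, `context_prior`, and `predict_domain`, the class-to-domain mapping, the domain names, and the threshold value of 0.5 are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from context classes to domains (illustrative only,
# not the mapping or the domain set used for SemanticAircraft).
CLASS_TO_DOMAIN = {
    "sky": "sky",
    "clouds": "sky",
    "runway": "ground",
    "grass": "ground",
}


def context_prior(label_map, target="airplane"):
    """Return the fraction of non-target pixels covered by each context class."""
    counts = Counter(
        label for row in label_map for label in row if label != target
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def predict_domain(prior, threshold=0.5):
    """Return the domain with the largest context share, or "unknown"
    if no mapped domain reaches the (assumed) threshold."""
    shares = Counter()
    for label, fraction in prior.items():
        domain = CLASS_TO_DOMAIN.get(label)
        if domain is not None:
            shares[domain] += fraction
    if not shares:
        return "unknown"
    domain, share = shares.most_common(1)[0]
    return domain if share >= threshold else "unknown"


# Toy 3x4 label map standing in for a pixel-wise segmentation annotation.
label_map = [
    ["sky", "sky", "clouds", "sky"],
    ["sky", "airplane", "airplane", "sky"],
    ["grass", "airplane", "airplane", "sky"],
]
prior = context_prior(label_map)
print(predict_domain(prior))
```

The same per-image prior could also serve as a filter during data aggregation, for example by discarding images whose context shares match none of the domains of interest; that use is likewise an assumption about how such statistics might be applied.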