A Fast Hybrid Model for Large-Scale Zero-Shot Image Recognition Based on Knowledge Graphs

Bo Xiao, Yujiao Du, Q. M. Jonathan Wu, Qianfang Xu, Liping Yan

2019 (modified: 15 Nov 2021) | IEEE Access 2019 | Readers: Everyone

Abstract: Zero-shot learning aims to recognize unseen categories by learning an embedding space between data samples and semantic representations. For large-scale datasets with thousands of categories, embedding vectors of category labels are often used as the semantic representation, since it is difficult to define the semantic attributes of categories manually. To address the underutilization of prior knowledge during the construction of embedding vectors, this paper first constructs a novel knowledge graph as a supplement to the basic WordNet graph, and then proposes a fast hybrid model, ARGCN-DKG (Attention-based Residual Graph Convolutional Network on Different types of Knowledge Graphs). By introducing a residual mechanism and an attention mechanism, and by integrating different knowledge graphs, the accuracy of knowledge transfer between categories can be improved. Our model uses only a 2-layer GCN, pretrained image features, and category semantic features, so training can be completed in minutes on a single GPU, which could make it one of the fastest models to train for large-scale image recognition. Experimental results demonstrate that the ARGCN-DKG model achieves better results on large-scale datasets than the state-of-the-art model.
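
The abstract names the ingredients (a 2-layer GCN, a residual mechanism, an attention mechanism, and several knowledge graphs) but gives no layer equations. The following is a minimal NumPy sketch of how such pieces could fit together, not the authors' implementation. Everything specific here is an assumption made for illustration: the symmetric adjacency normalization, the ReLU activation, the linear shortcut `w_res`, the per-category softmax attention vector `attn`, the function names, and the toy graphs and dimensions.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 (assumed, standard for GCNs)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gcn_layer(a_norm, h, w, activate):
    """One graph convolution: propagate over the graph, then project."""
    out = a_norm @ h @ w
    return np.maximum(out, 0.0) if activate else out  # ReLU is an assumption

def hybrid_forward(adjs, x, branches, attn):
    """adjs: one adjacency matrix per knowledge graph, each (N, N).
    x: (N, d_in) category label embeddings.
    branches: one (w1, w2, w_res) tuple per graph (hypothetical parameters).
    attn: (d_out,) attention vector scoring each graph's output (hypothetical).
    Returns fused (N, d_out) category representations and (N, G) weights."""
    outputs = []
    for adj, (w1, w2, w_res) in zip(adjs, branches):
        a_norm = normalize_adjacency(adj)
        h = gcn_layer(a_norm, x, w1, activate=True)     # layer 1
        out = gcn_layer(a_norm, h, w2, activate=False)  # layer 2
        outputs.append(out + x @ w_res)                 # residual shortcut
    stacked = np.stack(outputs, axis=1)                 # (N, G, d_out)
    alpha = softmax(stacked @ attn, axis=1)             # attention over graphs
    fused = (alpha[..., None] * stacked).sum(axis=1)    # weighted combination
    return fused, alpha

# Toy setup: 4 categories, 2 graphs, made-up dimensions.
rng = np.random.default_rng(0)
n, d_in, d_hid, d_out = 4, 5, 6, 3
hierarchy = np.array([[0, 1, 1, 0],
                      [1, 0, 0, 1],
                      [1, 0, 0, 0],
                      [0, 1, 0, 0]], dtype=float)  # stand-in for a WordNet-like graph
supplement = np.array([[0, 0, 0, 1],
                       [0, 0, 1, 0],
                       [0, 1, 0, 0],
                       [1, 0, 0, 0]], dtype=float)  # stand-in for a second graph
x = rng.normal(size=(n, d_in))
branches = [(rng.normal(size=(d_in, d_hid)),
             rng.normal(size=(d_hid, d_out)),
             rng.normal(size=(d_in, d_out))) for _ in range(2)]
attn = rng.normal(size=d_out)

fused, alpha = hybrid_forward([hierarchy, supplement], x, branches, attn)
print(fused.shape, alpha.shape)
```

In a zero-shot setting of this kind, the fused output for each category would typically be regressed onto classifier weights derived from pretrained image features for the seen categories, and the same forward pass would then supply weights for unseen ones. That training step is omitted here because the abstract does not describe it.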
