Abstract: In multi-label image classification, recent studies often use Graph Convolutional Networks (GCNs) to model dependencies between category labels. However, existing GCN-based methods have two major drawbacks. First, an adjacency matrix built only from the label statistics of the dataset does not capture co-occurrence relationships comprehensively, and a fixed adjacency matrix may reduce the generalization of the model. Second, GCNs may suffer from over-smoothing during node updates. To address these problems, we propose a Multi-Label classification model based on an Adaptive Knowledge Graph (ML-AKG). ML-AKG has three components: (1) an adaptive adjacency matrix constructed from a knowledge graph, which yields better category label dependencies; (2) a residual connection between the input and output of the GCN layer, which alleviates over-smoothing and vanishing gradients in the GCN; and (3) a pre-trained multimodal model that replaces the traditional CNN as the image encoder. We conducted extensive experiments on public multi-label image classification benchmarks, and the results verify the effectiveness of our method. Our model achieves mAP of 80.1%, 94.1%, and 94.6% on MS-COCO, VOC 2007, and VOC 2012, respectively.
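As a rough illustration of component (2), the sketch below adds a skip connection between the input and output of a single GCN layer. It is not the authors' implementation: the function names, the symmetric normalization with self-loops, the ReLU activation, and the square weight matrix (needed so the input and output shapes match for the addition) are all assumptions made for this example.

```python
import numpy as np


def normalize_adjacency(adj):
    """Assumed normalization: D^{-1/2} (A + I) D^{-1/2} with self-loops added."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)  # strictly positive for non-negative adj
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def residual_gcn_layer(h, adj_norm, weight):
    """One GCN update plus a skip connection: H' = ReLU(A_norm @ H @ W) + H.

    h:        (num_labels, dim) node (label) features
    adj_norm: (num_labels, num_labels) normalized adjacency matrix
    weight:   (dim, dim) layer weights; square so that H' and H can be added
    """
    update = np.maximum(adj_norm @ h @ weight, 0.0)
    return update + h


# Hypothetical usage with 4 labels and 8-dimensional label embeddings.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
h0 = rng.normal(size=(4, 8))
w0 = rng.normal(size=(8, 8)) * 0.1
h1 = residual_gcn_layer(h0, normalize_adjacency(adj), w0)
```

Because the input is added back to the layer output, each node keeps a direct copy of its own features even after repeated neighborhood averaging, which is the intuition behind using residual connections against over-smoothing.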