Multi-label Image Recognition Based on Multi-modal Graph Convolutional Networks Using Captioning Features

Published: 2021, Last Modified: 12 May 2023, GCCE 2021, Readers: Everyone
Abstract: This paper presents a multi-label image recognition model based on multi-modal Graph Convolutional Networks (GCNs) that uses captioning features. GCNs have frequently been used to model label dependencies in recent work on multi-label image recognition. However, these dependencies are built from information taken directly from the images. In this paper, we import captioning features into the GCN-based multi-label image recognition model so that it uses both image and text information. Introducing image captioning into the GCN model can improve the accuracy of multi-label image recognition. The experimental results show the effectiveness of the proposed method.
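
The abstract does not give architectural details, so the following is only a minimal sketch of the general idea: a graph convolution over a label graph produces one classifier per label, and each classifier is scored against a feature vector that combines image and caption information. The choice of fusion (plain concatenation), the single-layer GCN, all function names, and all numbers are assumptions made for illustration, not the authors' method.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a label graph."""
    n = len(adj)
    # Add self-loops so each label keeps its own embedding.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(a_norm, h, w):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in out]

def label_scores(image_feat, caption_feat, classifiers):
    """Score each label against the fused image and caption features."""
    # Assumption: fusion by simple concatenation of the two feature vectors.
    fused = image_feat + caption_feat
    return [sum(c * f for c, f in zip(clf, fused)) for clf in classifiers]

# Hypothetical toy data: 3 labels, where labels 0 and 1 co-occur.
adj = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]
label_embeddings = [[1, 0], [0, 1], [1, 1]]   # 3 labels x 2 dims
weights = [[1, 0, 0, 0], [0, 1, 0, 0]]        # 2 dims -> 4 dims (fused feature size)

a_norm = normalize_adjacency(adj)
classifiers = gcn_layer(a_norm, label_embeddings, weights)

image_feat = [1.0, 2.0]     # stand-in for CNN image features
caption_feat = [3.0, 4.0]   # stand-in for captioning features
scores = label_scores(image_feat, caption_feat, classifiers)
print(scores)
```

In this toy setup the two connected labels end up with identical classifiers after the graph convolution, which illustrates how label dependencies can propagate through the graph. A real model would learn the weights and use features from trained image and captioning networks.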