A Framework for Visual Relation Detection Exploiting Global Context

Published: 01 Jan 2021, Last Modified: 28 Oct 2023, CIS 2021
Abstract: Visual relation detection (VRD) is crucial for comprehensive image understanding, which requires capturing the interactions between detected objects. However, inferring the relations between objects is challenging due to the lack of rich context and semantic information. Most previous works on VRD focus on local context or simple semantic information. Aiming to gather richer information, we develop a modular (dismountable) VRD framework that combines global features with traditional local features from both vision and semantics. Specifically, we first investigate how to construct an efficient global context. We then propose a dual attention model (DAM) to gather the necessary information from the global context in different modalities. At the reasoning stage, four kinds of features are fused to predict pairwise relations between objects in the image. Experimental results on the Visual Genome (VG) dataset validate the effectiveness of our model.
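The pipeline the abstract describes — attending over a global context in two modalities, then fusing four kinds of features for pairwise relation prediction — can be illustrated with a minimal numpy sketch. This is an assumed, simplified reading of the method, not the authors' implementation: the dimensionalities, the dot-product attention form, the linear classifier, and all variable names here are hypothetical placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, context):
    """Dot-product attention: weight each context vector by its
    similarity to the query, then return the weighted sum."""
    scores = context @ query            # (n,) similarity scores
    weights = softmax(scores)           # attention distribution over context
    return weights @ context            # aggregated global feature, shape (d,)

rng = np.random.default_rng(0)
d = 8  # hypothetical feature dimension

# Local features for one candidate object pair (placeholders).
local_vis = rng.normal(size=d)          # local visual feature
local_sem = rng.normal(size=d)          # local semantic feature

# Global context: features of all n detected objects, per modality.
global_vis_ctx = rng.normal(size=(5, d))
global_sem_ctx = rng.normal(size=(5, d))

# "Dual attention": gather global information in each modality,
# guided by the corresponding local feature.
glob_vis = attend(local_vis, global_vis_ctx)
glob_sem = attend(local_sem, global_sem_ctx)

# Reasoning stage: fuse the four kinds of features and score predicates.
fused = np.concatenate([local_vis, local_sem, glob_vis, glob_sem])  # (4d,)
W = rng.normal(size=(3, 4 * d))         # 3 hypothetical predicate classes
probs = softmax(W @ fused)              # relation distribution for this pair
```

In practice the fusion and classifier would be learned jointly with the attention modules; the sketch only shows how the four feature kinds flow together into a single pairwise prediction.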