Abstract: We propose modeling context in human-object interaction (HOI) detection both visually and semantically by combining a visual graph with a semantic graph. A group of graph update modules, comprising graph inner update modules and graph cross update modules, learns the context that matters for each interaction. We fuse the contextual features from the visual graph and the semantic graph with the visual features of the human-object pairs in a network to detect HOIs. We evaluate the proposed model on two challenging datasets, HICO-DET and V-COCO, and demonstrate strong performance on both. Our work can serve as a reference for modeling contextual information in HOI detection.
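To make the roles of the modules concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes that an inner update passes messages among the nodes of one graph, that a cross update passes messages from one graph to the other, and that fusion is a plain concatenation. All function names, the fixed 0.5 mixing weight, and the use of mean aggregation are illustrative assumptions; a trained model would use learned projections and attention weights instead, and the two graphs would not need to share a feature dimension.

```python
from typing import List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def mix(a: Vector, b: Vector, alpha: float = 0.5) -> Vector:
    """Convex combination alpha * a + (1 - alpha) * b (alpha is an assumed constant)."""
    return [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]


def inner_update(nodes: List[Vector]) -> List[Vector]:
    """Hypothetical inner update: each node mixes with the mean of the other nodes in its own graph."""
    updated = []
    for i, node in enumerate(nodes):
        others = [v for j, v in enumerate(nodes) if j != i]
        updated.append(mix(node, mean(others)) if others else list(node))
    return updated


def cross_update(target: List[Vector], source: List[Vector]) -> List[Vector]:
    """Hypothetical cross update: each target node mixes with the mean of the other graph's nodes.

    Assumes both graphs use the same feature dimension.
    """
    summary = mean(source)
    return [mix(node, summary) for node in target]


def fuse(pair_feature: Vector, visual_nodes: List[Vector], semantic_nodes: List[Vector]) -> Vector:
    """Hypothetical fusion: concatenate the pair feature with pooled context from both graphs."""
    return pair_feature + mean(visual_nodes) + mean(semantic_nodes)


# Toy inputs: two nodes per graph, two-dimensional features.
visual = inner_update([[1.0, 0.0], [0.0, 1.0]])
semantic = inner_update([[2.0, 2.0], [4.0, 4.0]])

# Both cross updates read the inner-updated graphs, so the exchange is simultaneous.
visual_ctx = cross_update(visual, semantic)
semantic_ctx = cross_update(semantic, visual)

fused = fuse([1.0], visual_ctx, semantic_ctx)
print(fused)  # [1.0, 1.75, 1.75, 1.75, 1.75]
```

In this sketch the fused vector would be passed to an interaction classifier; the classifier itself is omitted because the abstract does not describe it.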