Incorporating Visual Grounding In GCN For Zero-shot Learning Of Human Object Interaction Actions

Chinmaya Devaraj, Cornelia Fermüller, Yiannis Aloimonos

Published: 2023, Last Modified: 05 Nov 2023, CVPR Workshops 2023, Readers: Everyone

Abstract: GCN-based zero-shot learning approaches commonly use fixed input graphs that represent external knowledge, usually derived from language. However, such input graphs fail to capture the nuances of the visual domain. We introduce a method to visually ground the external knowledge graph. We demonstrate the method on a novel concept of grouping actions according to a shared notion and show that it achieves superior performance in zero-shot action recognition on two challenging human manipulation action datasets, EPIC Kitchens and Charades. We further show that visually grounding the knowledge graph enhances the performance of GCNs when an adversarial attack corrupts the input graph.
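The abstract does not say how the knowledge graph is grounded, so the following is only a minimal sketch of the general setting, not the authors' method. It shows a standard GCN propagation step over a class graph, with the language-derived adjacency blended with a visual-similarity adjacency. The blending rule, the `alpha` parameter, the function names, and the random placeholder data are all assumptions made for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by standard GCN layers."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops, so every degree is > 0
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def cosine_similarity_graph(features):
    """Hypothetical visual graph: non-negative cosine similarity, zero diagonal."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = np.clip(unit @ unit.T, 0.0, None)
    np.fill_diagonal(sim, 0.0)
    return sim


def blend_graphs(language_adj, visual_adj, alpha=0.5):
    """Assumed grounding rule (not from the paper): convex combination of graphs."""
    return (1.0 - alpha) * language_adj + alpha * visual_adj


def gcn_layer(adj_norm, node_feats, weight):
    """One GCN layer: ReLU(A_norm @ H @ W)."""
    return np.maximum(adj_norm @ node_feats @ weight, 0.0)


# Placeholder data: 4 action classes in a chain-shaped language graph.
rng = np.random.default_rng(0)
num_classes, embed_dim, visual_dim, out_dim = 4, 8, 6, 5
language_adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
word_embeddings = rng.normal(size=(num_classes, embed_dim))      # node inputs
visual_prototypes = rng.normal(size=(num_classes, visual_dim))   # assumed per-class visual features
weight = rng.normal(size=(embed_dim, out_dim))                   # untrained layer weights

visual_adj = cosine_similarity_graph(visual_prototypes)
grounded_adj = blend_graphs(language_adj, visual_adj, alpha=0.3)
adj_norm = normalize_adjacency(grounded_adj)
classifiers = gcn_layer(adj_norm, word_embeddings, weight)  # one output row per class
```

In GCN-based zero-shot pipelines of this kind, each output row is typically treated as a classifier for one class, which lets unseen classes inherit information from their graph neighbours. How visual information is obtained for the graph nodes here is left open, since the abstract does not describe it.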
