Unified QA-aware Knowledge Graph Generation Based on Multi-modal Modeling

2022 (modified: 24 Oct 2022), ACM Multimedia 2022, Readers: Everyone
Abstract: Understanding the storyline of long-duration videos is often considered a major challenge in the field of video understanding. To promote research on understanding longer videos, the deep video understanding (DVU) task was proposed; it requires recognizing interactions at the scene level and relationships at the movie level, as well as answering questions at both levels. In this work, we propose a unified QA-aware knowledge graph generation approach that consists of a relation-centric graph and an interaction-centric graph, and we demonstrate the strong performance of multimodal pre-training models on such problems. Extensive validation on the HLVU dataset demonstrates the effectiveness of the proposed method.