Abstract: Most existing semantic video understanding methods process each video independently, without considering the underlying inter-video relationships. However, videos uploaded by individuals to social media platforms such as YouTube and Instagram exhibit inter-video relationships that reflect the individual's interests, geography, culture, etc. In this work, we explicitly model these inter-video relationships, which originate from the creators of the videos, using Graph Neural Networks (GNNs) in a multimodal setup. We perform video classification by using the creators of the videos and the semantic similarity between videos to create edges between videos, and we observe improvements of 4% in accuracy.
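The edge construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cosine-similarity measure, the `sim_threshold` value, and the rule that either signal alone is enough to create an edge are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_video_edges(creators, embeddings, sim_threshold=0.8):
    """Return sorted undirected edges (i, j), i < j, between video indices.

    Assumed rule (hypothetical): connect two videos if they share a creator
    OR their embeddings have cosine similarity >= sim_threshold.
    """
    edges = []
    n = len(creators)
    for i in range(n):
        for j in range(i + 1, n):
            same_creator = creators[i] == creators[j]
            similar = cosine(embeddings[i], embeddings[j]) >= sim_threshold
            if same_creator or similar:
                edges.append((i, j))
    return edges


# Toy data: four videos, three creators, 2-d embeddings (all made up).
creators = ["alice", "alice", "bob", "carol"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
edges = build_video_edges(creators, embeddings)
print(edges)
```

In this toy example, videos 0 and 1 are linked because they share a creator, and videos 0 and 2 are linked because their embeddings are nearly parallel. The resulting edge list is the kind of input a GNN layer would consume for message passing between videos.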