Building Scalable Video Understanding Benchmarks through Sports

ICLR 2024 Workshop DMLR Submission23 Authors

Published: 04 Mar 2024, Last Modified: 02 May 2024DMLR @ ICLR 2024EveryoneRevisionsBibTeXCC BY 4.0
Keywords: automated annotation, dataset, video understanding
TL;DR: We introduce ASAP, a pipeline for aligning unlabeled videos with corresponding freely available dense web annotations and create LCric, a large-scale long video understanding benchmark, with over 1000 hours of densely annotated long Cricket videos.
Abstract: Existing benchmarks for evaluating long video understanding falls short on two critical aspects, either lacking in scale or quality of annotations. These limitations arise from the difficulty in collecting dense annotations for long videos, which often require manually labeling each frame. In this work, we introduce an automated Annotation and Video Stream Alignment Pipeline (abbreviated ASAP). We demonstrate the generality of ASAP by aligning unlabeled videos of four different sports with corresponding freely available dense web annotations (i.e. commentary). We then leverage ASAP’s scalability to create LCric, a large-scale long video understanding benchmark, with over 1000 hours of densely annotated long Cricket videos (with an average sample length of $\sim$50 mins) collected at virtually zero annotation cost. We benchmark and analyze state-of-the-art video understanding models on LCric through a large set of compositional multi-choice and regression queries. We establish a human baseline that indicates significant room for new research to explore. Our human studies indicate that ASAP can align videos and annotations with high fidelity, precision, and speed. The dataset along with the code for ASAP and baselines will be publicly released.
Primary Subject Area: Data collection and benchmarking techniques
Paper Type: Research paper: up to 8 pages
Participation Mode: Virtual
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 23
Loading