End-to-end dense video grounding via parallel regression

Fengyuan Shi, Weilin Huang, Limin Wang

Published: 2024, Last Modified: 13 Nov 2024, Comput. Vis. Image Underst. 2024, CC BY-SA 4.0

Abstract: Highlights
• We cast video grounding (VG) as a direct regression problem and present a simple yet effective framework (PRVG) for dense VG with accurate and efficient inference.
• We design a robust and scale-invariant proposal-level attention loss function to guide the training of PRVG for better performance.
• Extensive experiments demonstrate the superiority of PRVG and the effectiveness of the parallel decoding paradigm on the dense video grounding task.