End-to-end dense video grounding via parallel regression

Published: 01 Jan 2024, Last Modified: 13 Nov 2024, Venue: Comput. Vis. Image Underst. 2024, License: CC BY-SA 4.0
Abstract: Highlights
• We cast video grounding (VG) as a direct regression problem and present a simple yet effective framework (PRVG) for dense VG with accurate and efficient inference.
• We design a robust and scale-invariant proposal-level attention loss function to guide the training of PRVG for better performance.
• Extensive experiments demonstrate the superiority of PRVG and the effectiveness of the parallel decoding paradigm on the dense video grounding task.