Multi-modal integrated proposal generation network for weakly supervised video moment retrieval

Published: 2025, Last Modified: 16 May 2025, Expert Syst. Appl. 2025, CC BY-SA 4.0
Abstract: Highlights
• Propose MIPGN: a novel approach for improved video moment retrieval.
• Frame clustering acquires scene context, guiding adaptive proposal generation.
• Leverage pretrained models for multi-modal tag extraction, enriching feature representation.
• Contrastive learning-based multi-task objective boosts model training efficiency.
• Efficacy demonstrated through testing on Charades-STA and ActivityNet datasets.