Motion-guided token prioritization and semantic degradation fusion for exo-to-ego cross-view video generation

Weipeng Hu, Jiun Tian Hoe, Runzhong Zhang, Yiming Yang, Haifeng Hu, Yap-Peng Tan

Published: 2025, Last Modified: 28 Jul 2025. Inf. Fusion 2025. License: CC BY-SA 4.0

Abstract: Highlights
• A novel video-based method, TPDF (motion-guided Token Prioritization and semantic Degradation Fusion), is proposed for the cue-free E2VG task.
• MSPT and MTPT incorporate motion cues and orthogonal constraints to adaptively identify informative tokens, ensuring spatial–temporal consistency in generation.
• The SDF progressively learns egocentric semantics through a degradation learning mechanism.
• Built on a cascaded cross-self attention framework, the designed CPD effectively compensates for the degradation of egocentric semantic information and incorporates informative tokens at different granularities.
• TPDF achieves state-of-the-art performance in the cue-free E2VG task.