Bridging asymmetry between image and video: Cross-modality knowledge transfer based on learning from video

Published: 2025, Last Modified: 04 Nov 2025, Expert Syst. Appl. 2025, CC BY-SA 4.0
Abstract: Highlights
• Enhance action-semantic learning in the image modality to address the asymmetry between modalities.
• Transfer action semantics from video to the image feature learning model from both global and local perspectives.
• Our model improves performance by 3.9% and 4.2% over the SOTA model.