MGTR-MISS: More Ground Truth Retrieving based Multimodal Interaction and Semantic Supervision for video description

Jiayu Zhang, Pengjie Tang, Yunlan Tan, Hanli Wang

12 Nov 2025, Neural Networks, Everyone, CC BY-SA 4.0