Multi-view Distillation based on Multi-modal Fusion for Few-shot Action Recognition (CLIP-M2DF).

Fei Guo, Yikang Wang, Han Qi, Wenping Jin, Li Zhu

21 Jan 2026, CoRR 2024, CC BY-SA 4.0