Abstract: Highlights
• Propose modules to enhance spatio-temporal modeling for few-shot action recognition tasks.
• Temporal Enhancement Adaptation improves temporal feature extraction in videos.
• Spatio-Temporal Fusion Adaptation integrates spatial and temporal features effectively.
• Text-Enhanced Prototype Module fuses textual and visual data for better prototype quality.
• Achieves competitive results on benchmarks with minimal trainable parameters.
External IDs:dblp:journals/ijon/ZhangYSL25