Modularized Self-Reflected Video Reasoner for Multimodal LLM with Application to Video Question Answering.

Zihan Song, Xin Wang, Zi Qian, Hong Chen, Longtao Huang, Hui Xue, Wenwu Zhu

21 Jan 2026, ICML 2025, Everyone, CC BY-SA 4.0