LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training.

Xiang An, Yin Xie, Kaicheng Yang, Wenkang Zhang, Xiuwei Zhao, Zheng Cheng, Yirui Wang, Songcen Xu, Changrui Chen, Chunsheng Wu, Huajie Tan, Chunyuan Li, Jing Yang, Jie Yu, Xiyao Wang, Bin Qin, Yumeng Wang, Zizhen Yan, Ziyong Feng, Ziwei Liu et al. (2 additional authors not shown)

12 Nov 2025 | CoRR 2025 | Everyone | CC BY-SA 4.0