IJCAI 2025 Workshop MKLM Submissions
Multi-Modal Interpretability for Enhanced Localization in Vision-Language Models
Muhammad Imran, Yugyung Lee
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
Representation Learning with Adaptive Superpixel Coding
Mahmoud Khalil, Ahmad Khalil, Alioune Ngom
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
MMDU-Bench: Multi-modal Deep Unlearning Benchmark
Ziyang Zhang
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
GeoChain: Multimodal Chain-of-Thought for Geographic Reasoning
Nilay Pande, Sahiti Yerramilli, Jayant Sravan Tamarapalli, Rynaa Grover
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
HueManity: Probing Fine-Grained Visual Perception in MLLMs
Rynaa Grover, Jayant Sravan Tamarapalli, Sahiti Yerramilli, Nilay Pande
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
Commonsense Storage Reasoning in Domestic Scenes: A Challenge for Vision-Language Models
Michaela Levi Richter, Oren Glickman, Reuth Mirsky
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
ManeuverVLM: A Novel Multimodal Fusion of Scene Images and Temporal Signals for Maneuver Prediction
Roksana Yahyaabadi, Soodeh Nikan
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
End-to-End RAW Synergy for Elevated Vision-Language Reasoning
Kepeng Xu, Tong Qiao, Zhenyang Liu, Li Xu, Gang He
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
Ontology-Guided Prompting for Reasoning in Multimodal Vision-Language Models: An Application to Rare Dental Disease
Kareem elgohary, Ali Ayadi, KAWCZYNSKI Marzena, Agnes BLOCH-ZUPAN, Cédric Wemmert
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial Reasoning
Yihong Tang, Ao Qu, Zhaokai Wang, Dingyi Zhuang, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
MET-Bench: Multimodal Entity Tracking for Evaluating the Limitations of Vision-Language and Reasoning Models
Vanya Cohen, Ray Mooney
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
Can Large Vision Language Models Understand Sarcasm?
Xinyu Wang, Yue Zhang
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone
MemeBlip2: A Novel Light Weight Multimodal System to Detect Harmful Memes
Ran Tong, Jiaqi Liu, Aowei Shen, Shuzheng Li, Changlin Yang, Lisha Xu
Published: 14 Jun 2025, Last Modified: 16 Aug 2025
MKLM 2025
Readers: Everyone