UnICLAM: Contrastive representation learning with adversarial masking for unified and interpretable Medical Vision Question Answering

Chenlu Zhan, Peng Peng, Hongwei Wang, Gaoang Wang, Yu Lin, Tao Chen, Hongsen Wang

Published: 2025, Last Modified: 25 Jul 2025. Medical Image Anal. 2025. CC BY-SA 4.0

Abstract: Highlights
• We propose UnICLAM, a unified Medical-VQA framework with joint alignment and learning.
• We introduce adversarial masking for data augmentation and improved cross-modal alignment.
• Experimental results show our model outperforms others in prediction and interpretability.
• We explore Medical-VQA’s role in heart failure diagnosis and its few-shot adaptation.