UnICLAM: Contrastive representation learning with adversarial masking for unified and interpretable Medical Vision Question Answering

Published: 01 Jan 2025, Last Modified: 25 Jul 2025 | Medical Image Anal. 2025 | CC BY-SA 4.0
Abstract: Highlights
• We propose UnICLAM, a unified Medical-VQA framework with joint alignment and learning.
• We introduce adversarial masking for data augmentation and improved cross-modal alignment.
• Experimental results show our model outperforms others in prediction and interpretability.
• We explore Medical-VQA’s role in heart failure diagnosis and its few-shot adaptation.