FairMed-VLM: Toward Equitable Medical Diagnosis with Vision–Language Models
Track: Main Papers Track (6 to 9 pages)
Keywords: Fairness, Vision–Language Model, Medical
Abstract: Vision--Language Models (VLMs) show promise for medical diagnostics but may encode demographic biases that exacerbate healthcare disparities. We present \textbf{FairMed-VLM}, a fairness-aware fine-tuning framework that combines biomedical concept alignment, instruction tuning, and a two-part fairness objective (cross-group supervised contrastive learning plus label-conditioned moment alignment). Across six medical imaging benchmarks, FairMed-VLM improves macro accuracy from 73.1\% to 78.3\% (+5.2 points; +7.1\% relative) while reducing the macro demographic parity gap from 14.2\% to 6.2\% (a 56.3\% relative reduction). On the OL3I opportunistic CT dataset, accuracy rises from 73.6\% to 78.5\% and the parity gap decreases from 13.1\% to 5.0\% (a 61.8\% relative reduction). Ablation studies show that balanced sampling and the fairness objective contribute complementary gains. These results demonstrate that fairness and accuracy can be jointly advanced in medical vision--language models.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 1