Keywords: Artificial Intelligence, Dermatology, Foundation Models, Health Equity in AI, Skin Lesion Classification, Skin Tone Bias, Generalizability, Algorithmic Fairness, Responsible AI in Healthcare
Abstract: Foundation models are reshaping medical AI by enabling efficient transfer learning from large, pretrained representations. In this work, we evaluate Google Health’s Derm Foundation Model for skin lesion classification and fairness in dermatologic imaging. Using pre-encoded embeddings from PAD-UFES-20 and DERM12345, we trained lightweight classifiers for five major conditions: Actinic Keratosis, Basal Cell Carcinoma, Malignant Melanoma, Squamous Cell Carcinoma, and Seborrheic Keratosis. The classifiers achieved high AUCs and consistent performance across sex, age, and lesion characteristics, demonstrating the strength of foundation-model representations for dermatology. However, fairness analysis revealed noticeably lower sensitivity for darker Fitzpatrick skin tones (types 4-6), indicating bias embedded within the pretrained feature space. Applying importance weighting and group-balanced resampling mitigated but did not fully eliminate these disparities. Our findings highlight the need for more diverse pretraining datasets and fairness-aware adaptation strategies to ensure equitable deployment of foundation models in clinical AI applications.
Submission Number: 16
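Illustrative sketch: the abstract mentions importance weighting and per-group sensitivity but does not specify how either was computed. The Python below is one common way such quantities might be calculated, not the authors' implementation. The function names (`group_importance_weights`, `sensitivity_by_group`), the inverse-frequency weighting scheme, and the toy labels are all assumptions made for illustration; the data is made up and does not reflect results from the paper.

```python
from collections import Counter


def group_importance_weights(groups):
    """Inverse-frequency weights: n / (k * count[group]).

    With these weights every group contributes the same total weight,
    and the weights sum to the number of samples.
    """
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]


def sensitivity_by_group(y_true, y_pred, groups):
    """Per-group sensitivity (true-positive rate): TP / (TP + FN)."""
    true_pos, positives = Counter(), Counter()
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            positives[g] += 1
            if p == 1:
                true_pos[g] += 1
    # Groups with no positive samples are omitted (sensitivity undefined).
    return {g: true_pos[g] / positives[g] for g in positives}


# Toy data, invented for illustration only (hypothetical Fitzpatrick bins).
groups = ["1-3", "1-3", "1-3", "1-3", "4-6", "4-6"]
y_true = [1, 1, 0, 1, 1, 1]
y_pred = [1, 1, 0, 1, 1, 0]

weights = group_importance_weights(groups)
sens = sensitivity_by_group(y_true, y_pred, groups)
print(weights)
print(sens)
```

In a setting like the one described, weights of this kind would typically be passed to the classifier's training loss as per-sample weights, and the per-group sensitivities compared before and after reweighting.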