cgDDI: Controllable Generation of Diverse Dermatological Imagery for Fair and Efficient Malignancy Classification

ICLR 2026 Conference Submission 18753 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Dermatology, Fairness, Synthetic, Generative
Abstract: Skin diseases affect millions of people around the world, across different backgrounds and ethnicities. Accurate dermatological diagnosis therefore requires focused work toward fairness across populations with different skin tones. However, a significant lack of expertly annotated dermatological images, especially of underrepresented skin tones and rare diseases, slows progress toward broadly accurate models and clear fairness metrics. In this work, we introduce **C**ontrollable **G**eneration of **D**iverse **D**ermatological **I**magery (**cgDDI**), a method capable of (1) synthesizing pixel-perfect in-distribution healthy samples, (2) mapping extremely rare lesions onto novel skin-tone combinations without training, and (3) efficient high-fidelity parametric generation with as few as $10$ training samples. Our approach is controllable via learned disease-specific prompts or skin-tone descriptors, supplied either visually or textually, allowing selection of key sensitive attributes. We leverage cgDDI to grow a dataset of $656$ real images by more than $400\times$. The resulting skin-tone-balanced dataset enables the development of accurate classification systems along with significant improvement on essential fairness metrics. Malignancy classification experiments on the Diverse Dermatology Images (DDI) benchmark show that our method reaches competitive performance ($86.4$% accuracy) when trained exclusively on our synthetic data and state-of-the-art performance ($90.9$% accuracy) when fine-tuned on real data. Additionally, we achieve leading metrics for Predictive Quality Disparity, Demographic Disparity, and Equality of Opportunity, as well as equitable generative image quality measurements for underrepresented skin tones and rare diseases. We publish code, model weights, and generated datasets at https://anonymous.4open.science/r/ControllableGenDDI in support of further research in this direction.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 18753
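
The abstract names several fairness metrics without defining them on this page. As a rough illustration only, the sketch below assumes the commonly used definitions: Demographic Disparity as the largest gap in positive-prediction rate between groups, and Equality of Opportunity as the largest gap in true-positive rate between groups. The function names, group labels, and toy data are hypothetical and are not taken from the authors' code; the paper may define these metrics differently.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group positive-prediction rate and true-positive rate.

    Assumes binary labels (1 = malignant, 0 = benign). Returns a dict
    mapping group -> (positive_rate, tpr); tpr is None when the group
    has no positive ground-truth samples.
    """
    stats = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "true_pos": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        s = stats[g]
        s["n"] += 1
        s["pred_pos"] += p
        s["pos"] += t
        s["true_pos"] += t * p
    rates = {}
    for g, s in stats.items():
        positive_rate = s["pred_pos"] / s["n"]
        tpr = s["true_pos"] / s["pos"] if s["pos"] else None
        rates[g] = (positive_rate, tpr)
    return rates


def _gap(values):
    """Largest difference between any two defined values."""
    vals = [v for v in values if v is not None]
    return max(vals) - min(vals)


def demographic_disparity(y_true, y_pred, groups):
    """Gap in positive-prediction rate across groups (assumed definition)."""
    return _gap(r[0] for r in group_rates(y_true, y_pred, groups).values())


def equal_opportunity_gap(y_true, y_pred, groups):
    """Gap in true-positive rate across groups (assumed definition)."""
    return _gap(r[1] for r in group_rates(y_true, y_pred, groups).values())


# Toy data: two hypothetical skin-tone groups, four samples each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

dd = demographic_disparity(y_true, y_pred, groups)
eo = equal_opportunity_gap(y_true, y_pred, groups)
```

Under these assumed definitions a value of 0 means the groups are treated identically on that criterion, and larger values indicate a larger disparity.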