Rare, Distinctive, Memorized: Auditing Memorization in Fine-Tuned Medical Foundation Models

Santhosh Parampottupadam; Sinem Sav; Dimitrios Bounias; Saikat Roy; Klaus Maier-Hein; Adam Dziedzic; Franziska Boenisch; Ralf Floca

Published: 04 Jun 2026, Last Modified: 04 Jun 2026 | Venue: ICML MemFM 2026 Workshop Poster | Readers: Everyone | License: CC BY 4.0

Keywords: memorization, medical AI, privacy, membership inference, visual distinctiveness, rare diseases, differential privacy

TL;DR: In fine-tuned medical foundation models, memorization tracks class rarity times visual distinctiveness; medical pretraining does not blunt this, so rare visually distinctive patients face the highest re-identification risk unless DP-LoRA is applied.

Abstract: Medical foundation models fine-tuned on private patient data risk leaking individual training samples; in this regime a single recoverable image is a privacy violation. We audit per-class memorization in fine-tuned medical foundation models with a loss-difference memorization score: each canary's loss is compared between a model trained with the canary and an otherwise identical model trained without it. Memorization occurs in every architecture-dataset cell we test. Interestingly, the rarest class is not the most memorized: the rarity-monotonicity assumed by prior memorization work outside medical imaging is broken in this regime. Instead, visual distinctiveness contributes to memorization beyond rarity in our setting: the same controlled grayscale intervention applied to two classes at comparable rarity shifts memorization in opposite directions, and on a balanced dataset $M(x)$ rises from a near-zero baseline (0.004) to high memorization (0.480) under the same intervention. DP-LoRA at $\epsilon = 1$ reduces leakage from the most-memorized class by 90% in our setup while preserving usable accuracy on focused diagnostic tasks, with the protection driven by the DP noise rather than the parameter restriction. These findings point towards more privacy-preserving adaptation of medical foundation models.
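The loss-difference score described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the loss function, the sign convention, or any normalization, so the sketch assumes cross-entropy loss and defines the score as the loss under the model trained without the canary minus the loss under the model trained with it. The function names `cross_entropy`, `memorization_score`, and `per_class_mean` are hypothetical, and the reported values of $M(x)$ (0.004, 0.480) may be on a different scale than this raw difference.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label given predicted class probabilities."""
    # Clamp to avoid log(0) when a model assigns zero probability to the true class.
    return -math.log(max(probs[label], 1e-12))


def memorization_score(probs_in, probs_out, label):
    """Assumed score: loss WITHOUT the canary in training minus loss WITH it.

    probs_in  -- class probabilities from the model trained with the canary
    probs_out -- class probabilities from the otherwise identical model trained without it
    A larger positive value means the canary is fit much better only when it was seen.
    """
    return cross_entropy(probs_out, label) - cross_entropy(probs_in, label)


def per_class_mean(scores, labels):
    """Average canary scores within each class for a per-class audit."""
    sums, counts = {}, {}
    for score, label in zip(scores, labels):
        sums[label] = sums.get(label, 0.0) + score
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}
```

Under these assumptions, a canary that the "with" model classifies confidently and the "without" model does not receives a large positive score, while a canary both models handle equally well scores near zero.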

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 28
