Leaky Diffusion: Attribute Leakage in Text-Guided Image Generation

Anastasios Lepipas, Marios Charalambides, Jiani Liu, Yiying Guan, Dominika Woszczyk, Mansi, Thanh Hai Le, Soteris Demetriou

Published: 06 May 2025, Last Modified: 07 Apr 2026, PoPETS 2025, Everyone, CC BY 4.0

Abstract: Text-guided diffusion models can be used to generate photorealistic images conditioned on natural language instructions. Due to their ease of use, millions of users already leverage them to generate and populate images online. In this work, we reveal the risk of attribute (authorship and dementia) leakage from such models. Existing authorship and dementia inferences rely primarily on text. We show that instructions are a new form of text that can reveal these attributes. More surprisingly, and in contrast to prior work, we show that those attributes can be transferred to, and leaked from, images generated with diffusion models. In particular, we construct image and multi-modal adversarial models which leverage image data augmentation and text-image embedding models to achieve state-of-the-art performance in spear authorship inference (up to 0.877 Top-5 accuracy for 100 authors), while dementia inference is possible even from the output images alone (0.75 accuracy on the ADReSS dataset). Our rigorous evaluation shows that such inferences remain robust across different training sets, when trained in classifier-independent ways, and against SOTA mitigations such as paraphrasing Transformer models and LLMs.