Low Resource Reconstruction Attacks Through Benign Prompts

Published: 11 Jun 2025, Last Modified: 13 Jul 2025 · MemFM · CC BY-SA 4.0
Keywords: Reconstruction Attacks, Copyright, Privacy, Memorization, Diffusion Models
TL;DR: We devise an attack that can extract images from training data (including images of real people) using seemingly benign prompts such as "Abstract Art Unisex T-Shirt".
Abstract: The rising popularity of diffusion models has raised serious concerns around privacy, copyright, and data leakage. Prior work has demonstrated that training data can be partially reconstructed, but these attacks often require significant resources, training set access, or carefully crafted prompts. In this work, we present a low-resource attack that reveals a more subtle risk: even seemingly innocuous prompts can lead to the unintended reconstruction of real training images. Strikingly, we show that prompts like "Abstract Art Unisex T-Shirt" can generate identifiable human faces included in the training data. Our findings point to a systemic vulnerability rooted in the use of scraped e-commerce data, where templated layouts tightly couple visual content with prompt patterns. This raises new concerns about the ease with which unintentional data leaks may occur.
Submission Number: 3