[Proposal-ML] Towards Safer T2I Generation by Refining Implicit Prompts

30 Oct 2024 (modified: 05 Nov 2024) · THU 2024 Fall AML Submission · CC BY 4.0
Keywords: Stable diffusion, AI safety, Prompt refinement
Abstract: With the advancement of Text-to-Image (T2I) technology, high-quality images can now be created effortlessly from arbitrary human-written prompts by models such as Stable Diffusion and DALL-E, attracting widespread attention and unprecedented popularity. However, this flourishing T2I community has also raised growing privacy concerns, particularly regarding celebrity privacy. The unauthorized generation of celebrity images can spread misinformation and damage reputations. To tackle this problem, many T2I models are equipped with an additional safety checker that filters out user prompts containing celebrity names. Such filters offer a simple yet effective way to reduce privacy threats. However, implicit prompts, which clearly suggest a celebrity figure without directly containing the name, pose a more subtle privacy risk, so a corresponding defense strategy is urgently needed. In this project, we first design a method to strengthen the defense of T2I models against implicit prompts. Moreover, going beyond simply rejecting implicit prompts, we propose to refine user prompts so that they preserve as much of the original information as possible while no longer leading to images of celebrities. We hope this project will contribute to the responsible application of T2I technology and foster advancements in ethical AI practices.
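To make the gap concrete, below is a minimal sketch of the kind of name-matching safety checker the abstract describes, and of how an implicit prompt slips past it. The `CELEBRITY_NAMES` list and both example prompts are illustrative assumptions, not part of the proposal.

```python
# Hypothetical name list for illustration; a real filter would use a much
# larger curated list (and likely fuzzy/normalized matching).
CELEBRITY_NAMES = {"taylor swift", "elon musk"}

def is_prompt_blocked(prompt: str) -> bool:
    """Return True if the prompt explicitly contains a known celebrity name."""
    lowered = prompt.lower()
    return any(name in lowered for name in CELEBRITY_NAMES)

# An explicit prompt is caught by substring matching:
print(is_prompt_blocked("a portrait of Taylor Swift"))  # True

# An implicit prompt evades the filter even though it clearly points
# at a specific person, which is the risk this proposal targets:
print(is_prompt_blocked("the pop star who wrote the album 1989, on stage"))  # False
```

The point of the sketch is that keyword filtering operates on surface form only; handling implicit prompts requires reasoning about what the prompt refers to, which motivates the refinement approach proposed above.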
Submission Number: 43