Governing Open Vocabulary Data Leaks Using an Edge LLM through Programming by Example

Published: 01 Jan 2024, Last Modified: 30 Jul 2025. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2024. License: CC BY-SA 4.0
Abstract: A major concern with integrating large language model (LLM) services (e.g., ChatGPT) into workplaces is that employees may inadvertently leak sensitive information through their prompts. Since user prompts can involve arbitrary vocabularies, conventional data leak mitigation solutions, such as string-matching-based filtering, often fall short. We present GPTWall, a privacy firewall that helps internal admins create and manage policies to mitigate data leaks in prompts sent to external LLM services. GPTWall's key innovations are (1) introducing a lightweight LLM running on the edge to obfuscate target information in prompts and restore that information after receiving responses, and (2) helping admins author fine-grained disclosure policies through programming by example. We evaluated GPTWall with 12 participants and found that they could create an average of 17.7 policies within 30 minutes, achieving an increase of 29% in precision and 22% in recall over a state-of-the-art data de-identification tool.
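The abstract does not include implementation details, but the obfuscate-then-restore workflow it describes can be sketched as follows. This is a minimal illustration only: the edge LLM's actual job of identifying policy-matched spans is replaced here by a caller-supplied term list, and all function names and the placeholder format are hypothetical, not taken from GPTWall.

```python
def obfuscate(prompt, sensitive_terms):
    """Replace each policy-matched sensitive term with an opaque placeholder
    before the prompt leaves the organization for the external LLM service."""
    mapping = {}
    for i, term in enumerate(sensitive_terms):
        placeholder = f"<ENTITY_{i}>"
        mapping[placeholder] = term
        prompt = prompt.replace(term, placeholder)
    return prompt, mapping

def restore(response, mapping):
    """Substitute the original terms back into the external LLM's response
    so the end user sees an unredacted answer."""
    for placeholder, term in mapping.items():
        response = response.replace(placeholder, term)
    return response

# Hypothetical usage: "Acme Corp" and "Jane Doe" stand in for spans an
# edge LLM would flag under an admin-authored disclosure policy.
obfuscated, mapping = obfuscate(
    "Summarize the Q3 revenue of Acme Corp for Jane Doe.",
    ["Acme Corp", "Jane Doe"],
)
# obfuscated == "Summarize the Q3 revenue of <ENTITY_0> for <ENTITY_1>."
restored = restore("Here is the summary for <ENTITY_1> about <ENTITY_0>.", mapping)
# restored == "Here is the summary for Jane Doe about Acme Corp."
```

The design point this sketch captures is that the mapping never leaves the edge: the external service sees only placeholders, while restoration happens locally.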