Keywords: large language model, privacy, personally identifiable information, named entity detection
TL;DR: We show how LLMs can fail to recognize human names even in short text snippets due to contextual ambiguity.
Abstract: Large language models (LLMs) are increasingly used to protect personal user data. These privacy solutions often assume that LLMs can reliably detect named entities and personally identifiable information (PII). In this paper, we challenge that assumption by revealing how LLMs regularly overlook broad classes of sensitive names, even in short text snippets, due to contextual ambiguity. We construct AMBENCH, a benchmark dataset of seemingly ambiguous yet real entity names, designed around the name regularity bias phenomenon and embedded in concise text snippets containing benign prompt injections. Our experiments with state-of-the-art LLMs and specialized PII detection tools show that recall on AMBENCH names drops by 20--40\% compared to more recognizable names. AMBENCH names are also four times more likely to be ignored by supposedly privacy-preserving, LLM-powered text analysis tools adopted in industry. Our findings expose blind spots in current LLM-based privacy defenses and call for a systematic investigation of their privacy failure modes.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 19978