Keywords: Deanonymization, Large language models, Privacy, User matching
Abstract: We show that large language models erode the privacy offered by online pseudonymous accounts by enabling fully automated and highly effective deanonymization attacks at scale.
We first demonstrate that today's most capable LLM agents can deanonymize users in fully open-world settings by autonomously searching the web, querying databases, and reasoning over evidence to identify real
individuals from pseudonymous profiles alone. On the Anthropic Interviewer dataset, our agent replicates prior deanonymization results in minutes per target, matching what would take a dedicated human
investigator hours.
We then design a scalable attack pipeline that uses LLMs to: (1) extract identity-relevant features from unstructured posts and comments, (2) match profiles via semantic search over large candidate sets, and (3)
reason over top candidates to verify matches and reduce false positives. Unlike prior work that required structured data or manual feature engineering, our approach works directly on raw user content across
arbitrary platforms.
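The following is a minimal sketch of such a three-stage pipeline, assuming an OpenAI-style chat and embeddings API. The model names, prompts, similarity measure, and top-k cutoff are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal sketch of a three-stage matching pipeline (extract -> search -> verify),
# assuming an OpenAI-style API client. Model names, prompts, and the top-k cutoff
# are illustrative guesses, not the paper's actual configuration.
from openai import OpenAI
import numpy as np

client = OpenAI()

def extract_features(posts: list[str]) -> str:
    """Stage 1: summarize identity-relevant cues from raw posts."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical extractor model
        messages=[{
            "role": "user",
            "content": "Extract identity-relevant features (location, occupation, "
                       "interests, writing style) from these posts:\n"
                       + "\n".join(posts),
        }],
    )
    return resp.choices[0].message.content

def embed(texts: list[str]) -> np.ndarray:
    """Embed feature summaries for semantic search over the candidate set."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

def rank_candidates(anon_posts: list[str], candidate_profiles: list[str], k: int = 5) -> list[int]:
    """Stage 2: rank candidates by cosine similarity of feature embeddings."""
    query = embed([extract_features(anon_posts)])                        # (1, d)
    cands = embed([extract_features([p]) for p in candidate_profiles])  # (n, d)
    sims = (cands @ query.T).ravel() / (
        np.linalg.norm(cands, axis=1) * np.linalg.norm(query) + 1e-9)
    return list(np.argsort(-sims)[:k])  # indices of the top-k candidates

def verify(anon_posts: list[str], candidate: str) -> bool:
    """Stage 3: reason over the evidence to confirm or reject a top candidate."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # hypothetical verifier model
        messages=[{
            "role": "user",
            "content": "Do these pseudonymous posts and this candidate profile "
                       "belong to the same person? Answer YES or NO with reasons.\n\n"
                       "Posts:\n" + "\n".join(anon_posts)
                       + "\n\nCandidate:\n" + candidate,
        }],
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")
```

In this sketch, the verification stage is applied only to the few highest-ranked candidates, which is how the final reasoning step can reduce false positives without running expensive LLM comparisons against the full candidate set.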
We build evaluation benchmarks with ground truth by exploiting naturally occurring cross-platform links and by splitting single-user profiles into synthetic pairs. We evaluate in multiple settings: matching
LinkedIn profiles to anonymized Hacker News accounts and linking users across Reddit movie discussion communities. In each setting, LLM-based methods substantially outperform classical baselines.
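A minimal sketch of the synthetic-pair construction follows, assuming each profile is stored as a list of posts keyed by user id; the random half/half split is an illustrative choice, not necessarily the paper's split criterion.

```python
# Minimal sketch of splitting single-user profiles into synthetic query/candidate
# pairs with known ground truth. The random half/half split is an assumption.
import random

def split_profiles(profiles: dict[str, list[str]], seed: int = 0):
    """Split each single-user profile into two disjoint halves.

    The 'query' half plays the pseudonymous account, the 'candidate' half joins
    the candidate pool, and ground_truth records the true match for each query.
    """
    rng = random.Random(seed)
    queries, candidates, ground_truth = {}, {}, {}
    for user, posts in profiles.items():
        posts = posts[:]
        rng.shuffle(posts)
        mid = len(posts) // 2
        queries[f"q_{user}"] = posts[:mid]
        candidates[f"c_{user}"] = posts[mid:]
        ground_truth[f"q_{user}"] = f"c_{user}"
    return queries, candidates, ground_truth
```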
Our results show that the practical obscurity protecting pseudonymous users online no longer holds and that threat models for online privacy need to be reconsidered.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 27