Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification

Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification

ACL ARR 2025 May Submission1876 Authors

18 May 2025 (modified: 03 Jul 2025)ACL ARR 2025 May SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Abstract: Recent advancements in large language models (LLMs) have been fueled by large-scale training corpora drawn from diverse sources such as websites, news articles, and books. These datasets often contain explicit user information, such as person names, addresses, that LLMs may unintentionally reproduce in their generated outputs. Beyond such explicit content, LLMs can also leak identity-revealing cues through implicit signals such as distinctive writing styles, raising significant concerns about authorship privacy. There are three major automated tasks in authorship privacy, namely authorship obfuscation (AO), authorship mimicking (AM), and authorship verification (AV). Prior research has studied AO, AM, and AV independently. However, their interplays remain under-explored, which leaves a major research gap, especially in the era of LLMs, where they are profoundly shaping how we curate and share user-generated content, and the distinction between machine‑generated and human‑authored text is also increasingly blurred. This work then presents the first unified framework for analyzing the dynamic relationships among LLM-enabled AO, AM, and AV in the context of authorship privacy. We quantify how they interact with each other to transform human‑authored text, examining effects at a single point in time and iteratively over time. We also examine the role of demographic metadata, such as gender, academic background, in modulating their performances, inter-task dynamics, and privacy risks. All source code will be publicly available.

Paper Type: Long

Research Area: Interpretability and Analysis of Models for NLP

Research Area Keywords: large language models, authorship privacy

Contribution Types: Model analysis & interpretability, NLP engineering experiment, Reproduction study, Publicly available software and/or pre-trained models, Data resources

Languages Studied: English

Submission Number: 1876

Loading