IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization

ACL ARR 2024 June Submission4649 Authors

16 Jun 2024 (modified: 10 Aug 2024) | ACL ARR 2024 June Submission | CC BY 4.0

Abstract: In this work, we address the problem of text anonymization, where the goal is to prevent adversaries from correctly inferring private attributes of the author while preserving the text's utility, i.e., its meaning and semantics. We propose IncogniText, a technique that anonymizes the text to mislead a potential adversary into predicting a wrong private attribute value. Our empirical evaluation shows a reduction of private attribute leakage by more than $90\%$. Finally, we demonstrate the maturity of IncogniText for real-world applications by distilling its anonymization capability into a set of LoRA parameters associated with an on-device model.

Paper Type: Short

Research Area: Ethics, Bias, and Fairness

Research Area Keywords: privacy, text anonymization, large language models

Languages Studied: English

Submission Number: 4649
