Processing Text for Privacy: An Information Flow Perspective

Published: 2018, Last Modified: 21 Jan 2026, FM 2018, CC BY-SA 4.0
Abstract: The problem of text document obfuscation is to provide an automated mechanism that makes the content of a text document accessible without revealing the identity of its writer. This is more challenging than it seems, because an adversary equipped with powerful machine learning mechanisms can identify authorship with good accuracy even where, for example, the name of the author has been redacted. Current obfuscation methods are ad hoc and have been shown to provide weak protection against such adversaries. Differential privacy, which provides strong privacy guarantees in some domains, has been thought not to be applicable to text processing.