Unlearnable Text for Neural Classifiers

Anonymous

16 Jan 2022 (modified: 05 May 2023) · ACL ARR 2022 January Blind Submission · Readers: Everyone
Abstract: Neural text classification models are known to exploit statistical patterns during supervised learning. However, such patterns include spurious patterns and superficial regularities in the training data. In this paper, we exaggerate superficial regularities in the text to prevent unauthorized exploitation of personal data. We propose a gradient-based method to construct text modifications that can make deep neural networks (DNNs) unlearnable. We then analyze the text modifications exposed by the gradient-based method and further propose two simple hypotheses for manually crafting unlearnable text. Experiments on four tasks (sentiment classification, topic classification, reading comprehension, and gender classification) validate the effectiveness of our method: with these hypotheses, models achieve almost untrained performance after training on unlearnable text.
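The abstract does not spell out the gradient-based construction, so the following is only an illustrative sketch of one plausible scheme in the spirit of error-minimizing "unlearnable" modifications: for each training example, candidate token substitutions are scored by the first-order change in training loss, and the substitution that most decreases the loss is applied, giving the model a superficial shortcut to fit. All names here (ToyClassifier, make_unlearnable, the vocabulary size, the number of swaps) are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of gradient-guided, loss-minimizing token substitution.
# Not the authors' exact method; for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, CLASSES = 1000, 32, 2

class ToyClassifier(nn.Module):
    """Bag-of-embeddings text classifier used only for illustration."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.out = nn.Linear(DIM, CLASSES)

    def forward(self, token_ids):                     # token_ids: (batch, seq)
        return self.out(self.emb(token_ids).mean(dim=1))

def make_unlearnable(model, token_ids, labels, n_swaps=1):
    """Greedily replace n_swaps tokens per example with the vocabulary word
    whose first-order effect (gradient . embedding difference) lowers the
    training loss the most."""
    token_ids = token_ids.clone()
    emb_matrix = model.emb.weight                      # (VOCAB, DIM)
    for _ in range(n_swaps):
        embeds = model.emb(token_ids).detach().requires_grad_(True)
        logits = model.out(embeds.mean(dim=1))
        loss = F.cross_entropy(logits, labels)
        grad = torch.autograd.grad(loss, embeds)[0]    # (B, T, DIM)
        # First-order loss change for swapping position t to word v:
        # (e_v - e_current) . grad_t ; pick the most negative entry.
        scores = torch.einsum("btd,vd->btv", grad, emb_matrix) \
               - (grad * embeds).sum(-1, keepdim=True)
        flat = scores.reshape(scores.size(0), -1)
        best = flat.argmin(dim=1)                      # most loss-reducing swap
        pos, word = best // VOCAB, best % VOCAB
        token_ids[torch.arange(token_ids.size(0)), pos] = word
    return token_ids

if __name__ == "__main__":
    torch.manual_seed(0)
    model = ToyClassifier()
    x = torch.randint(0, VOCAB, (8, 12))               # 8 sentences, 12 tokens
    y = torch.randint(0, CLASSES, (8,))
    x_unlearnable = make_unlearnable(model, x, y, n_swaps=2)
    print((x != x_unlearnable).sum().item(), "tokens modified")
```

The intuition behind loss-minimizing (rather than loss-maximizing) edits is that the inserted shortcut lets the classifier reach low training loss without learning anything from the genuine content, which is consistent with the abstract's report of near-untrained test performance after training on the modified text.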
Paper Type: long