Adversarial Evaluation of Transformers as Soft Reasoners

Anonymous

08 Mar 2022 (modified: 05 May 2023) | NAACL 2022 Conference Blind Submission | Readers: Everyone
Paper Link: https://openreview.net/forum?id=9TPxZUpI_CM
Paper Type: Short paper (up to four pages of content + unlimited references and appendices)
Abstract: The RuleTaker models (Clark et al., 2020; Tafjord et al., 2020) have recently shown that transformers are capable of learning to reason deductively over facts and rules provided as natural language sentences. This is a significant achievement, since it eliminates the need to express the knowledge in a formal representation. In this paper, we evaluate the robustness of these models to adversarial attacks. We first investigate the presence of dataset biases: superficial cues that the models can exploit to obtain high accuracy without solving the task. We train a model on partial inputs, withholding parts that are essential for true reasoning. The high accuracy this model obtains reveals the existence of dataset biases. To examine whether the models attend to the preconditions necessary for valid reasoning, we present three adversarial attacks on the test set: ReplaceMid, which replaces a word in the theory; AddMid, which adds a new word to the theory; and ChangePolarity, which negates one sentence. Under these adversarial settings, accuracy drops from an average of 97.55% to 67.10%. This highlights the need to develop models that are more robust in terms of both logical and linguistic complexity.
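The abstract does not specify how the three perturbations are implemented, so the sketch below is only a minimal illustration of what they could look like on a RuleTaker-style theory (a list of natural-language sentences). The function names, the word pool argument, and the naive "is"/"is not" negation heuristic are assumptions for illustration, not the authors' actual code.

```python
# Hypothetical sketch of the three adversarial perturbations described in the
# abstract. All names and heuristics here are illustrative assumptions.
import random

def replace_mid(theory, vocab):
    """ReplaceMid: replace one word in a randomly chosen theory sentence."""
    sents = theory[:]
    i = random.randrange(len(sents))
    words = sents[i].split()
    words[random.randrange(len(words))] = random.choice(vocab)
    sents[i] = " ".join(words)
    return sents

def add_mid(theory, vocab):
    """AddMid: insert a new word at a random position in one sentence."""
    sents = theory[:]
    i = random.randrange(len(sents))
    words = sents[i].split()
    words.insert(random.randrange(len(words) + 1), random.choice(vocab))
    sents[i] = " ".join(words)
    return sents

def change_polarity(theory):
    """ChangePolarity: negate one sentence (naive 'is' <-> 'is not' toggle)."""
    sents = theory[:]
    i = random.randrange(len(sents))
    if " is not " in sents[i]:
        sents[i] = sents[i].replace(" is not ", " is ", 1)
    else:
        sents[i] = sents[i].replace(" is ", " is not ", 1)
    return sents

theory = ["The cat is big.", "If something is big then it is heavy."]
print(change_polarity(theory))  # e.g. ['The cat is not big.', ...]
```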