{
  "metadata": {
    "forum_id": "Sklgs0NFvr",
    "review_id": "HJgaPkj0YH",
    "rebuttal_id": "S1eSrnDqsB",
    "title": "Learning The Difference That Makes A Difference With Counterfactually-Augmented Data",
    "reviewer": "AnonReviewer3",
    "rating": 8,
    "conference": "ICLR2020",
    "permalink": "https://openreview.net/forum?id=Sklgs0NFvr&noteId=S1eSrnDqsB",
    "annotator": "anno7"
  },
  "review_sentences": [
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 0,
      "text": "This paper seeks to separate \"causal\" features from ones with spurious correlations in the context of natural language machine learning tasks.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 1,
      "text": "The proposed approach is to ask human annotators to alter examples in a minimal way that changes the label.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_motivation-impact",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 2,
      "text": "Thereby the humans separate out the causal features (those changed) from the spurious or irrelevant features (those left unchanged).",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_originality",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 3,
      "text": "Experiments show that classifiers trained on the original data perform poorly on the altered data and vice versa, but (unsurprisingly) training on the union of the two datasets results in a classifier that performs well in both cases.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 4,
      "text": "Furthermore, training an SVM on the original data results in irrelevant attributes (such as movie genre) being weighted, whereas these weights are largely removed when training on the union of the datasets.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 5,
      "text": "This suggests that the augmented training data results in weighting the \"right\" features more.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 6,
      "text": "Overall, I think this paper should be accepted because it makes several interesting contributions: It proposes an interesting approach, shows intriguing experimental results, and produces an interesting dataset (size ~2k) that may be useful for future testing.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_meaningful-comparison",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 7,
      "text": "The main limitation of the paper is that the evidence is largely circumstantial.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_result",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 8,
      "text": "The method has intuitive appeal and the experimental results are suggestive, but the experiments do not conclusively show that the method achieves something that ordinary machine learning does not.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "HJgaPkj0YH",
      "sentence_index": 9,
      "text": "My suggestion for a further experiment would be to apply the movie review classifiers to, say, book reviews -- something where the task is fundamentally the same but the context is different. If the classifier trained on the union of the original and altered datasets performs better than a classifier trained on only one dataset, then that is strong evidence that this approach yields better extrapolation.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "HJgaPkj0YH",
      "rebuttal_id": "S1eSrnDqsB",
      "sentence_index": 0,
      "text": "We thank the reviewer for positive feedback and for championing our paper.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_accept-praise",
      "alignment": [
        "context_none",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaPkj0YH",
      "rebuttal_id": "S1eSrnDqsB",
      "sentence_index": 1,
      "text": "We are also grateful for your constructive suggestions to improve the paper and would like to report on how we have incorporated your feedback.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_none",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaPkj0YH",
      "rebuttal_id": "S1eSrnDqsB",
      "sentence_index": 2,
      "text": "Inspired by your suggestion, we conducted additional experiments on Amazon Reviews, Yelp Reviews, and Semeval (Twitter) datasets, and found that the counterfactually-augmented data resulted in across-the-board gains.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "HJgaPkj0YH",
      "rebuttal_id": "S1eSrnDqsB",
      "sentence_index": 3,
      "text": "These experiments are featured in the updated draft.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    }
  ]
}