{
  "metadata": {
    "forum_id": "Hyx4knR9Ym",
    "review_id": "H1x3aUom2X",
    "rebuttal_id": "H1eRd3dyCQ",
    "title": "Generalizable Adversarial Training via Spectral Normalization",
    "reviewer": "AnonReviewer2",
    "rating": 6,
    "conference": "ICLR2019",
    "permalink": "https://openreview.net/forum?id=Hyx4knR9Ym&noteId=H1eRd3dyCQ",
    "annotator": "anno13"
  },
  "review_sentences": [
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 0,
      "text": "This paper is well set-up to target the interesting problem of degraded generalisation after adversarial training.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_motivation-impact",
      "polarity": "pol_positive"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 1,
      "text": "The proposal of applying spectral normalisation (SN) is well motivated, and is supported by margin-based bounds.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_motivation-impact",
      "polarity": "pol_positive"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 2,
      "text": "However, the experimental results are weak in justifying the paper's claims.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 3,
      "text": "Pros:",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 4,
      "text": "* The problem is interesting and well explained",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_clarity",
      "polarity": "pol_positive"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 5,
      "text": "* The proposed method is clearly motivated",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_motivation-impact",
      "polarity": "pol_positive"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 6,
      "text": "* The proposal looks theoretically solid",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_positive"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 7,
      "text": "Cons:",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 8,
      "text": "* It is unclear to me whether the \"efficient method for SN in convolutional nets\" is more efficient than the power iteration algorithm employed in previous work, such as Miyato et al. 2018, which also used SN in conv nets with different strides.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_clarity",
      "polarity": "pol_negative"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 9,
      "text": "There is no direct comparison of performance.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_meaningful-comparison",
      "polarity": "pol_negative"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 10,
      "text": "* Fig. 3 needs more explanation.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_explanation",
      "aspect": "asp_clarity",
      "polarity": "none"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 11,
      "text": "The horizontal axes are unlabelled, and \"margin normalization\" is confusing when shown together with SN without an explanation.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_edit",
      "aspect": "asp_clarity",
      "polarity": "none"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 12,
      "text": "Perhaps it's helpful to briefly introduce it in addition to citing Bartlett et al. 2017.",
      "suffix": "\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_edit",
      "aspect": "asp_meaningful-comparison",
      "polarity": "none"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 13,
      "text": "* The epsilons in Fig. 5 have very different scales (0 - 0.5 vs. 0 - 5). Are these relevant to the specific algorithms and why?",
      "suffix": "\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_explanation",
      "aspect": "asp_clarity",
      "polarity": "none"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 14,
      "text": "* Section 5.3 (Fig. 6) is the part most relevant to the generalisation problem.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 15,
      "text": "However, the results are unconvincing: only the results for epsilon = 0.1 are shown, and even so the advantage is marginal.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 16,
      "text": "Furthermore, the baseline models did not use other almost standard regularisation techniques (weight decay, dropout, batch-norm).",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_meaningful-comparison",
      "polarity": "pol_negative"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 17,
      "text": "It is thus unclear whether the advantage can be maintained after applying these standard regularisers.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "H1x3aUom2X",
      "sentence_index": 18,
      "text": "A typo in page 6, last line: wth -> with",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_typo",
      "aspect": "arg_other",
      "polarity": "none"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 0,
      "text": "We thank Reviewer 2 for the constructive feedback.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 1,
      "text": "Here is our point-to-point response to the comments and questions raised in the review:",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 2,
      "text": "1. \u201cIt is unclear to me whether the \"efficient method for SN in convolutional nets\" is more efficient than the power iteration algorithm employed in previous work, such as Miyato et al. 2018, which also used SN in conv nets with different strides. There is no direct comparison of performance.\u201d",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 3,
      "text": "We do not claim that our method is more efficient than Miyato et al.\u2019s method, which uses the spectral norm of the convolution kernel matrix to approximate the spectral norm of the convolution operation.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_contradict-assertion",
      "alignment": [
        "context_sentences",
        [
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 4,
      "text": "In fact, our proposed method is computationally more expensive than their approximate scheme because each power iteration in our method requires a conv/deconv operation rather than the simple division used by Miyato et al.",
      "suffix": "\n\n",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_contradict-assertion",
      "alignment": [
        "context_sentences",
        [
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 5,
      "text": "We introduce our new spectral normalization scheme for convolutional layers because there exist examples where the true spectral norm of a convolution operation can be arbitrarily larger than Miyato et al.\u2019s approximation.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_contradict-assertion",
      "alignment": [
        "context_sentences",
        [
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 6,
      "text": "Therefore, Miyato et al.\u2019s normalization scheme is not guaranteed to control the spectral norm of convolutional layers which is critical for controlling a DNN\u2019s generalization performance (please see our generalization bounds in Section 3).",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_contradict-assertion",
      "alignment": [
        "context_sentences",
        [
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 7,
      "text": "To further support our argument, we performed additional experiments demonstrating how our proposed method better controls the spectral norm of convolution layers, resulting in better generalization and test performance.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_contradict-assertion",
      "alignment": [
        "context_sentences",
        [
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 8,
      "text": "The results are presented in Appendix A.1.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          8,
          9
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 9,
      "text": "Furthermore, we run several experiments to show that our method is not significantly slower than Miyato et al.\u2019s method, and we report the results in Appendix A.1, Table 3.",
      "suffix": "\n\n",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 10,
      "text": "2. \u201cFig. 3 needs more explanation. The horizontal axes are unlabelled, and \"margin normalization\" is confusing\u201d",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          10,
          11,
          12
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 11,
      "text": "We relabel the axes and add a more thorough explanation in the caption.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          10,
          11,
          12
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 12,
      "text": "We note that the text explaining Figure 3 mentions how the margin normalization is performed (paragraph 3 in section 5.1): the margin normalization factor is exactly the capacity norm \\Phi described in Theorems 1-4.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          10,
          11,
          12
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 13,
      "text": "We clarify that we divide the obtained margins by the values of \\Phi estimated on the dataset.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          10,
          11,
          12
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 14,
      "text": "3. \u201cThe epsilons in Fig. 5 have very different scales (0 - 0.5 vs. 0 - 5). Are these relevant to the specific algorithms and why?\u201d",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 15,
      "text": "Yes, the epsilons are chosen to be different depending on whether we are looking at norm_inf attacks or norm_2 attacks.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 16,
      "text": "This is because the two norms can behave very differently in adversarial attack experiments.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 17,
      "text": "For example, a norm_inf attack of 0.5 implies that all pixels can be changed by 0.5.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 18,
      "text": "On the other hand, a norm_2 attack of 0.5 means the overall Euclidean norm of perturbation across all pixels is bounded by 0.5, resulting in a much less powerful attack.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 19,
      "text": "Based on this comment, we update the plots with the same attack-norm to have the same scale.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 20,
      "text": "4. \"Section 5.3 (Fig. 6) is the part most relevant to the generalisation problem.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          14,
          15,
          16,
          17
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 21,
      "text": "However, the results are unconvincing: only the results for epsilon = 0.1 are shown, and even so the advantage is marginal.\"",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          14,
          15,
          16,
          17
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 22,
      "text": "We redo the visualization in Figure 6 to make the gains provided by SN clearer.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          14,
          15,
          16,
          17
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 23,
      "text": "We see that using SN can improve the test performance by over 12% for some FGM, PGM, and WRM cases.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          14,
          15,
          16,
          17
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 24,
      "text": "5. \"The baseline models did not use other almost standard regularisation techniques (weight decay, dropout, batch-norm).",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          16,
          17
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 25,
      "text": "It is thus unclear whether the advantage can be maintained after applying these standard regularisers.\"",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          16,
          17
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 26,
      "text": "We did not originally discuss weight decay, dropout, and batch normalization as none of these methods were motivated by the theory we introduced in section 3.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          16,
          17
        ]
      ],
      "details": {}
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 27,
      "text": "However, due to the reviewers\u2019 concern, in the updated draft we compare spectrally-normalized networks to networks with the same architecture except with weight decay, dropout, or batch norm in Appendix A.2.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          16,
          17
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "H1x3aUom2X",
      "rebuttal_id": "H1eRd3dyCQ",
      "sentence_index": 28,
      "text": "In our experiments, the SN-regularized network still performs better in terms of test accuracy.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          16,
          17
        ]
      ],
      "details": {}
    }
  ]
}