{
  "metadata": {
    "forum_id": "SyMDXnCcF7",
    "review_id": "r1lORJDq3m",
    "rebuttal_id": "H1gQCvdvTm",
    "title": "A Mean Field Theory of Batch Normalization",
    "reviewer": "AnonReviewer2",
    "rating": 6,
    "conference": "ICLR2019",
    "permalink": "https://openreview.net/forum?id=SyMDXnCcF7&noteId=H1gQCvdvTm",
    "annotator": "anno2"
  },
  "review_sentences": [
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 0,
      "text": "This paper investigates the effect of the batch normalization in DNN learning.",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 1,
      "text": "The mean field theory in statistical mechanics was employed to analyze the",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 2,
      "text": "progress of variance matrices between layers.",
      "suffix": "\n",
      "review_action": "none",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 3,
      "text": "As the results, the batch normalization itself is found to be the cause of gradient explosion.",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 4,
      "text": "Moreover, the authors pointed out that near-linear activation function can improve such gradient explosion.",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 5,
      "text": "Some numerical studies were reported to confirm theoretical findings.",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 6,
      "text": "The detailed analysis of the training of DNN with the batch normalization is quite interesting.",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 7,
      "text": "There are some minor comments below.",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 8,
      "text": "- in page 3, 2line above eq(2): what is delta in the variance of the multivariate normal distribution?",
      "suffix": "\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_clarification",
      "aspect": "asp_clarity",
      "polarity": "none"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 9,
      "text": "- the notation q appeared in the middle part of page 3 before the definition of q is shown in the last paragraph of p.3.",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_clarity",
      "polarity": "pol_negative"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 10,
      "text": "- The randomized weight is not very practical. Though it may be the standard approach of mean field,",
      "suffix": "\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_edit",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "r1lORJDq3m",
      "sentence_index": 11,
      "text": "some comments would be helpful to the readers.",
      "suffix": "",
      "review_action": "none",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 0,
      "text": "Thank you for your review and very useful comments! We\u2019re happy you found our manuscript interesting.",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 1,
      "text": "To address your comments:",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 2,
      "text": "1) Thank you for pointing out that we had not defined the delta.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_concede-criticism",
      "alignment": [
        "context_sentences",
        [
          8
        ]
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 3,
      "text": "Here delta is the Kronecker delta defined so that \\delta_{a,b} = 1 if a = b and 0 if a != b. In the context of the variance of the multivariate normal distribution, the delta function indicates that the different neurons in each layer have zero covariance.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          8
        ]
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 4,
      "text": "We\u2019ll add an explicit discussion of this fact to the manuscript.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_by-cr",
      "alignment": [
        "context_sentences",
        [
          8
        ]
      ],
      "details": {
        "manuscript_change": true
      }
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 5,
      "text": "2) Thanks for pointing this out, we\u2019ll correct it in the next revision.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_by-cr",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {
        "manuscript_change": true
      }
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 6,
      "text": "3) It is true that the extent to which randomized weights describe trained networks is unclear.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_concede-criticism",
      "alignment": [
        "context_sentences",
        [
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 7,
      "text": "However, it is true that most commonly used weight initialization schemes are random.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 8,
      "text": "For example, He initialization [1] and Xavier initialization [2] strategies are both special cases of the setup considered here.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 9,
      "text": "We therefore view our theory as a theory of neural networks at initialization.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 10,
      "text": "(There are, however, initialization schemes that are not random and that are not described by our theory).",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 11,
      "text": "[1] K. He, X. Zhang, S. Ren, J. Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. (http://www.cv-foundation.org/openaccess/content_iccv_2015/html/He_Delving_Deep_into_ICCV_2015_paper.html)",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_other",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    },
    {
      "review_id": "r1lORJDq3m",
      "rebuttal_id": "H1gQCvdvTm",
      "sentence_index": 12,
      "text": "[2] X. Glorot, Y. Bengio, Y. W. Teh, M. Titterington. Understanding the difficulty of training deep feedforward neural networks. (http://proceedings.mlr.press/v9/glorot10a.html)",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_other",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    }
  ]
}