{
  "metadata": {
    "forum_id": "SyVU6s05K7",
    "review_id": "BJlCnxODhX",
    "rebuttal_id": "SJebCbtF67",
    "title": "Deep Frank-Wolfe For Neural Network Optimization",
    "reviewer": "AnonReviewer3",
    "rating": 8,
    "conference": "ICLR2019",
    "permalink": "https://openreview.net/forum?id=SyVU6s05K7&noteId=SJebCbtF67",
    "annotator": "anno0"
  },
  "review_sentences": [
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 0,
      "text": "This paper introduced a proximal approach to optimize neural networks by linearizing the network output instead of the loss function.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 1,
      "text": "They demonstrate their algorithm on multi-class hinge loss, where they can show that optimal step size can be computed in close form without significant additional cost.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 2,
      "text": "Their experimental results showed competitive performance to SGD/Adam on the same network architectures.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 3,
      "text": "1.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 4,
      "text": "Figure 1 is crucial to the algorithm design as it aims to prove that Loss-Preserving Linearization (LPL) preserves information on loss function.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 5,
      "text": "While the authors provided numerical plots to compare it with the SGD linearization, I personally prefer to see some analytically comparsion between SGD linearization and LPL even on the simplest case.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_explanation",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 6,
      "text": "An appendix with more numerical comparisons on other loss functions might also be insightful.",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 7,
      "text": "2. It seems LPL is mainly compared to SGD for convergence (e.g. Fig 2).",
      "suffix": "",
      "review_action": "arg_fact",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 8,
      "text": "In Table 2 I saw some optimizers end up with much lower test accuracy.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJlCnxODhX",
      "sentence_index": 9,
      "text": "Can the authors show the convergence plots of these methods (similar to Figure 2)?",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_substance",
      "polarity": "none"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "BJlCnxODhX",
      "rebuttal_id": "SJebCbtF67",
      "sentence_index": 0,
      "text": "We thank the reviewer for their comments and suggestions.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "BJlCnxODhX",
      "rebuttal_id": "SJebCbtF67",
      "sentence_index": 1,
      "text": "We answer below:",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    },
    {
      "review_id": "BJlCnxODhX",
      "rebuttal_id": "SJebCbtF67",
      "sentence_index": 2,
      "text": "1. As the reviewer accurately points out, we choose to always employ the hinge loss for DFW in this paper because it gives an optimal step-size.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJlCnxODhX",
      "rebuttal_id": "SJebCbtF67",
      "sentence_index": 3,
      "text": "In the new version of the paper, we have included additional baselines on the SNLI data set.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          4,
          5,
          6
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "BJlCnxODhX",
      "rebuttal_id": "SJebCbtF67",
      "sentence_index": 4,
      "text": "This provides more empirical comparisons between the performance of CE and SVM for different optimizers.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJlCnxODhX",
      "rebuttal_id": "SJebCbtF67",
      "sentence_index": 5,
      "text": "2. In appendix B.2 of the paper, we have added the convergence plot for all methods on the CIFAR data sets.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          7,
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJlCnxODhX",
      "rebuttal_id": "SJebCbtF67",
      "sentence_index": 6,
      "text": "In some cases the training performance can show some oscillations.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          7,
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJlCnxODhX",
      "rebuttal_id": "SJebCbtF67",
      "sentence_index": 7,
      "text": "We emphasize that this is the result of cross-validating the initial learning rate based on the validation set performance: sometimes a better behaved convergence would be obtained on the training set with a lower learning rate.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          7,
          8,
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJlCnxODhX",
      "rebuttal_id": "SJebCbtF67",
      "sentence_index": 8,
      "text": "However this lower learning rate is not selected because it does not provide the best validation performance (this is consistent with our discussion on the step size in section 6).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          7,
          8,
          9
        ]
      ],
      "details": {}
    }
  ]
}