{
  "metadata": {
    "forum_id": "ByME42AqK7",
    "review_id": "BygMkWst37",
    "rebuttal_id": "ryxN-nQVRX",
    "title": "Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution",
    "reviewer": "AnonReviewer2",
    "rating": 6,
    "conference": "ICLR2019",
    "permalink": "https://openreview.net/forum?id=ByME42AqK7&noteId=ryxN-nQVRX",
    "annotator": "anno3"
  },
  "review_sentences": [
    {
      "review_id": "BygMkWst37",
      "sentence_index": 0,
      "text": "- Summary",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 1,
      "text": "This paper proposes a multi-objective evolutionary algorithm for the neural architecture search.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 2,
      "text": "Specifically, this paper employs a Lamarckian inheritance mechanism based on network morphism operations for speeding up the architecture search.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 3,
      "text": "The proposed method is evaluated on CIFAR-10 and ImageNet (64*64) datasets and compared with recent neural architecture search methods.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 4,
      "text": "In this paper, the proposed method aims at solving the multi-objective problem: validation error rate as a first objective and the number of parameters in a network as a second objective.",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 5,
      "text": "- Pros",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 6,
      "text": "- The proposed method does not require to be initialized with well-performing architectures.",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 7,
      "text": "- This paper proposes the approximate network morphisms to reduce the capacity of a network (e.g., removing a layer), which is reasonable property to control the size of a network for multi-objective problems.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_positive"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 8,
      "text": "- Cons",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 9,
      "text": "- Judging from Table 1, the proposed method does not seem to provide a large contribution.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 10,
      "text": "For example, while the proposed method introduced the regularization about the number of parameters to the optimization, NASNet V2 and ENAS outperform the proposed method in terms of the accuracy and the number of parameters.",
      "suffix": "\n",
      "review_action": "none",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 11,
      "text": "- It would be better to provide the details of the procedure of the proposed method (e.g., Algorithm 1 and each processing of Algorithm 1) in the paper, not in the Appendix.",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_clarity",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 12,
      "text": "- In the case of the search space II, how many GPU days does the proposed method require?",
      "suffix": "\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_explanation",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BygMkWst37",
      "sentence_index": 13,
      "text": "- About line 10 in Algorithm 1, how does the proposed method update the population P? Please elaborate on this procedure.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_explanation",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 0,
      "text": "Dear AnonReviewer2,",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 1,
      "text": "thank you for your constructive feedback. Below we address your concerns and questions.",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 2,
      "text": "\u201cJudging from Table 1, the proposed method does not seem to provide a large contribution.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 3,
      "text": "For example, while the proposed method introduced the regularization about the number of parameters to the optimization, NASNet V2 and ENAS outperform the proposed method in terms of the accuracy and the number of parameters.\u201c",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 4,
      "text": "\u2192 The authors of NASNet only provide results for two regimes of parameters (3.3M and  27M) as they do not perform multi-objective optimization but rather just vary two parameters for building NASNet models (number of cells stacked, number of filters).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 5,
      "text": "Their method might be optimized to yield good results in these regimes and, admittedly, LEMONADE does not outperform NASNet for models with ~4M parameters.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 6,
      "text": "However, from Figure 3 and Table 2 one can see that only varying these two parameters for NASNet models is not necessarily sufficient to generate good models across all parameter regimes.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 7,
      "text": "E.g., LEMONADE clearly outperforms NASNet for very small models (50k params, 200k params - Table 2).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 8,
      "text": "We also refer to Appendix 3 (\u201cLEMONADE with 5 objectives\u201d), Figure 6, in the updated version of our paper, where one can see that while NASNet has quite strong performance in terms of error, number of parameters and number of multiply-add operations, it performs poorly in terms of inference time.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 9,
      "text": "Hence, there is a benefit in doing multi-objective optimization if one is actually interested in multiple objectives and diverse models rather than a single model.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 10,
      "text": "This is the main contribution of our paper and different to, e.g., the NASNet paper.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 11,
      "text": "The same likely also applies for ENAS (as they use the same search space and conduct very similar experiments).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 12,
      "text": "We also would like to highlight two things: 1) NASNet requires 40x computational resources than LEMONADE, so even if NASNet performs better for ~4M parameter models, LEMONADE achieves competitive performance in significantly less time.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 13,
      "text": "2) Table 1 shows results for models trained with different training pipelines and hyperparameters, and hence it is hard to say architecture X performs better than architecture Y since differences could simply be due to e.g. different learning rates, batch sizes, etc.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 14,
      "text": "In contrast, all other results in the paper (e.g., Figure 3 and Table 2) provide comparisons with exactly the same training pipeline and hyperparameters.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 15,
      "text": ".",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 16,
      "text": "\u201cIt would be better to provide the details of the procedure of the proposed method (e.g., Algorithm 1 and each processing of Algorithm 1) in the paper, not in the Appendix. \u201c",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          11
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 17,
      "text": "-> Thanks, we agree; we re-organized our paper accordingly.",
      "suffix": "\n\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          11
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 18,
      "text": "\u201c- In the case of the search space II, how many GPU days does the proposed method require?",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          12
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 19,
      "text": "-> We also ran this experiments for 7*8 GPU days, however the method converged after roughly 3*8 GPU days (meaning that there were no significant differences afterwards).",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          12
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 20,
      "text": "\u201cAbout line 10 in Algorithm 1, how does the proposed method update the population P? Please elaborate on this procedure.\u201d",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 21,
      "text": "-> The population is updated to be all non-dominated points from the current population and the generated children, i.e. the Pareto frontier based on all current models.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 22,
      "text": "We clarified this in Algorithm 1.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 23,
      "text": "Thanks for pointing us towards this.",
      "suffix": "\n\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_sentences",
        [
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BygMkWst37",
      "rebuttal_id": "ryxN-nQVRX",
      "sentence_index": 24,
      "text": "We hope this clarifies your questions. Thanks again for the review!",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    }
  ]
}