{
  "metadata": {
    "forum_id": "Hkx-ii05FQ",
    "review_id": "BJxHwvW23X",
    "rebuttal_id": "Skl8N_xipQ",
    "title": "The Cakewalk Method",
    "reviewer": "AnonReviewer1",
    "rating": 4,
    "conference": "ICLR2019",
    "permalink": "https://openreview.net/forum?id=Hkx-ii05FQ&noteId=Skl8N_xipQ",
    "annotator": "anno2"
  },
  "review_sentences": [
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 0,
      "text": "The authors argue that not knowing the distribution of rewards observed in the policy gradient algorithm hinders learning (and the tuning process).",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 1,
      "text": "They propose to replace the reward term in the policy gradient algorithm with its centered empirical cumulative distribution, which has a fixed and known U[-1, 1] distribution.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 2,
      "text": "They test their methods on a toy task that consists in finding inclusion maximal cliques (which tests for local optimality) against REINFORCE (including their variants: centering the rewards with a mean baseline or normalizing them), the cross-entropy method and Exp3.",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 3,
      "text": "I think that the current draft lacks strong experimental results to properly demonstrate the usefulness of the method.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 4,
      "text": "The method is only evaluated on a single task and many confounding variables (the design of the reward function, factorizing the parametric distribution into marginals, reporting results for a single (non-tuned?) learning rate, etc.) make evaluation difficult.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 5,
      "text": "The usefulness of the approach is also lessened by the greater importance of the choice of the optimizer.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 6,
      "text": "I would like the method to be applied on other domains such as continuous non-convex optimization and reinforcement learning.",
      "suffix": "\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_motivation-impact",
      "polarity": "none"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 7,
      "text": "Additionally, I find the motivation for caring about local optimality unconvincing.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 8,
      "text": "I take exception that people care more about local optimality than the actual objective.",
      "suffix": "",
      "review_action": "arg_fact",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 9,
      "text": "From a practical point of view, local optimality is a mean (that can be achieved via heuristic algorithms) to an end (the objective itself).",
      "suffix": "",
      "review_action": "arg_fact",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 10,
      "text": "This also holds for k-means, which is usually run multiple times with different starting conditions.",
      "suffix": "\n\n",
      "review_action": "arg_fact",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 11,
      "text": "Some comments:",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 12,
      "text": "- Table 3 is a bit confusing as-is (lower is only better when controlling on the quality of the best sample.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_clarity",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 13,
      "text": "e.g: REINFORCE has lower best-sample to total-sample ratio but its solutions are worse)",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 14,
      "text": "- It isn't clear from the tables that OCE_0.1 outperforms REINFORCE_Z (as is mentioned in the discussion).",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_clarity",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 15,
      "text": "- The paper should refer to 1) the reward shaping literature, 2) the growing line of works concerned with control variates for REINFORCE (such as VIMCO, MuProp, REBAR) and 3) the growing line of works concerned about combinatorial optimization with reinforcement learning (Neural Combinatorial Optimization with Reinforcement Learning, etc.)",
      "suffix": "\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_edit",
      "aspect": "asp_meaningful-comparison",
      "polarity": "pol_negative"
    },
    {
      "review_id": "BJxHwvW23X",
      "sentence_index": 16,
      "text": "- I would also encourage the authors to come up with a more descriptive name for the approach.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_edit",
      "aspect": "asp_soundness-correctness",
      "polarity": "none"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 0,
      "text": "We thank the reviewer for their evaluation.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 1,
      "text": "Please see our response at https://openreview.net/forum?id=Hkx-ii05FQ&noteId=HygFbNmL6X, where we also discuss our experimental framework.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_none",
        null
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 2,
      "text": "Even though we present results on two tasks, it appears the paper structure doesn\u2019t convey this clearly, and we suggest two possible ways how to update the paper in this regard.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_none",
        null
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 3,
      "text": "We note in this context that even though we would also like to see Cakewalk evaluated on the domains mentioned by the reviewer, these are not part of our own research agenda, and accordingly our suggestions refer to other problems in combinatorial optimization.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-request",
      "alignment": [
        "context_none",
        null
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 4,
      "text": "Next, we address other issues raised by the reviewer.",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 5,
      "text": "First, we\u2019d like to emphasize that the clique problem studied in the paper is far from being a toy problem.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 6,
      "text": "All the algorithms are evaluated on the DIMACS clique dataset which was published as part of the second DIMACS challenge which specifically focused on combinatorial optimization.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 7,
      "text": "Over the years, this dataset has become a standard benchmark for clique finding algorithms, and results on it are regularly published.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 8,
      "text": "In this respect, this dataset is an important benchmark for clique algorithms very much like CIFAR10 and CIFAR100 are for image classification methods.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 9,
      "text": "Notably, Cakewalk approaches the performance of the best clique finding algorithms that directly search a graph, and which are tailored to this specific task.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 10,
      "text": "Note that none of the tested methods were given enough samples even to recover the graph itself, as most graphs have more than 100 nodes, and we\u2019ve allowed only for 100 |V| samples in each execution.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 11,
      "text": "To us this seems as a rather challenging setup, not just for the algorithms we\u2019ve tested in this paper, but for any clique finding algorithm.",
      "suffix": "\n\n",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-request",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {
        "request_out_of_scope": false
      }
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 12,
      "text": "Next, we wonder how would the reviewer correct the confounds mentioned with regard to the clique experiment.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 13,
      "text": "Providing a controlled experiment is always challenging, though the elements mentioned by the reviewer were specifically selected as to reduce various confounds.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 14,
      "text": "The main research question we try to address is whether algorithms that only rely on function evaluations can recover locally optimal solutions.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 15,
      "text": "Since the objective is the only source of information for such algorithms, an all-or-none kind of objective would not be very useful.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 16,
      "text": "Instead, the objective is designed in a manner that provides information even for partial solutions, thus allowing the tested algorithms to gradually improve the objective.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 17,
      "text": "In terms of the sampling distribution, as our focus is on the update step, we decided to use the simplest possible sampling distribution we can think of.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 18,
      "text": "In such a regime, we can attribute any performance gains to the algorithms themselves, and not to any prior knowledge that is reflected by the structure of some complex sampling distribution.",
      "suffix": "\n\n",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          3,
          4,
          5,
          6
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 19,
      "text": "Next, we agree that local optimality is a mean rather than a goal (the objective itself).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_concede-criticism",
      "alignment": [
        "context_sentences",
        [
          7,
          8,
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 20,
      "text": "Nonetheless, as in the problems we seek to address the global optimum cannot be found in polynomial time",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          7,
          8,
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 21,
      "text": ", the second best approach is first to design a method that can recover locally optimal solutions.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          7,
          8,
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 22,
      "text": "Once such a method is available, repeated applications of that method can allow one to select a good solution, very much like the standard practice of repeated applications of k-means which the reviewer mentions.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          7,
          8,
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 23,
      "text": "This reasoning however is dependent on a method\u2019s capability of recovering locally optimal solutions, and therefore studying this ability makes for a worthwhile effort.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          7,
          8,
          9,
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 24,
      "text": "Answers to the last comments:",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          11
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 25,
      "text": "- Table 3 is indeed confusing, this is a good point. We will correct it.",
      "suffix": "\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_by-cr",
      "alignment": [
        "context_sentences",
        [
          12,
          13
        ]
      ],
      "details": {
        "manuscript_change": true
      }
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 26,
      "text": "- Methods that apply a surrogate objective work best with AdaGrad.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 27,
      "text": "In this case, our data is a classical use which is explored in the AdaGrad paper uses as a motivating example.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 28,
      "text": "Not surprisingly, both Cakewalk and OCE work best with it.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 29,
      "text": "REINFORCE however is sensitive to the objective values, and it appears that Adam somewhat mitigates this problem.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 30,
      "text": "However, this is not as effective as applying a surrogate objective, and REINF_Z with Adam is outperformed by OCE_0.1 with AdaGrad in all measures.",
      "suffix": "\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 31,
      "text": "- Our frame of reference were algorithms that could be applied to any combinatorial problem, and which only rely on function evaluations.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_refute-question",
      "alignment": [
        "context_sentences",
        [
          15
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 32,
      "text": "Control variates and reward shaping methods are mostly useful when tied to the particularities of a given objective, and thus do not fall into this category.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_refute-question",
      "alignment": [
        "context_sentences",
        [
          15
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 33,
      "text": "In neural combinatorial optimization the study is focused on designing a sampling distribution that reflects some prior knowledge about a problem, and thus, we consider this line of work as orthogonal to ours.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_refute-question",
      "alignment": [
        "context_sentences",
        [
          15
        ]
      ],
      "details": {}
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 34,
      "text": "Having said that, we see how these areas might seem related, and we will revise the related work section to better emphasize the aforementioned differences.",
      "suffix": "\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_by-cr",
      "alignment": [
        "context_sentences",
        [
          15
        ]
      ],
      "details": {
        "manuscript_change": true
      }
    },
    {
      "review_id": "BJxHwvW23X",
      "rebuttal_id": "Skl8N_xipQ",
      "sentence_index": 35,
      "text": "- We selected the name \u2018Cakewalk\u2019 after consulting with a few colleagues. Following a joint discussion, we concluded that this name has the best chance for increasing our work\u2019s impact.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-request",
      "alignment": [
        "context_sentences",
        [
          16
        ]
      ],
      "details": {
        "request_out_of_scope": false
      }
    }
  ]
}