{
  "metadata": {
    "forum_id": "ryenvpEKDr",
    "review_id": "rkeru_QAYH",
    "rebuttal_id": "H1xrAsVPor",
    "title": "A Constructive Prediction of the Generalization Error Across Scales",
    "reviewer": "AnonReviewer3",
    "rating": 8,
    "conference": "ICLR2020",
    "permalink": "https://openreview.net/forum?id=ryenvpEKDr&noteId=H1xrAsVPor",
    "annotator": "anno10"
  },
  "review_sentences": [
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 0,
      "text": "This work proposes a functional form for the relationship between <dataset size, model size> and generalization error, and performs an empirical study to validate it.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 1,
      "text": "First, it states 5 criteria that such a functional form must take, and proposes one such functional form containing 6 free coefficients that satisfy all these criteria.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 2,
      "text": "It then performs a rigorous empirical study consisting of 6 image datasets and 3 text datasets, each with 2 distinct architectures defined at several model scales, and trained with different dataset sizes.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 3,
      "text": "This process produces 42-49 data points for each <dataset, architecture> pair, and the 6 coefficients of the proposed functional form are fit to those data points, with < 2% mean deviation in accuracy.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 4,
      "text": "It then studies how this functional form performs at extrapolation, and finds that it still performs pretty well, with ~4.5% mean deviation in accuracy, but with additional caveats.",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 5,
      "text": "Decision: Accept.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "arg_other",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 6,
      "text": "This paper states 5 necessary criteria for any functional form for generalization error predictor that jointly considers dataset size and model size, then empirically verifies it with multiple datasets and architectures.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 7,
      "text": "These criteria are well justified, and can be used by others to narrow down the search for functions that approximate the generalization error of NNs without access to the true data distribution, which is a significant contribution.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 8,
      "text": "The empirical study is carefully done (e.g., taking care to subsample the dataset in a way that preserves the class distribution).",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 9,
      "text": "I also liked that the paper is candid about its own limitations.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "arg_other",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 10,
      "text": "A weakness that one might perceive is that the coefficients of the proposed functional form still needs to be fit to 40-ish trained NNs for every dataset and training hyperparameters, but I do not think this should be held against this work, because a generalization error predictor (let alone its functional form) that works for multiple datasets and architecture without training is difficult, and the paper does include several proposals for how this can still be used in practice.",
      "suffix": "\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 11,
      "text": "(Caveat: the use of the envelope function described in equation 5 (page 6) is not something I am familiar with, but seems reasonable.)",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 12,
      "text": "Issues to address:",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 13,
      "text": "- Fitting 6 parameters to 42-49 data points raises concerns about overfitting.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 14,
      "text": "Consider doing cross validation over those 42-49 data points, and report the mean of deviations computed on the test folds.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 15,
      "text": "The extrapolation section did provide evidence that there probably isn't /that/ much overfitting, but cross validation would directly address this concern.",
      "suffix": "\n",
      "review_action": "arg_fact",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 16,
      "text": "- In addition, the paper provides the standard deviation for the mean deviations over 100 fits of the function as the measure of its uncertainty, but I suspect that the optimizer converging to different coefficients at different runs isn't the main source of uncertainty.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 17,
      "text": "A bigger source of uncertainty is likely due to there being a limited amount of data to fit the coefficients to.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 18,
      "text": "Taking the standard deviation over the deviations measured on different folds of the data would be better measure of uncertainty.",
      "suffix": "\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 19,
      "text": "Minor issues:",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rkeru_QAYH",
      "sentence_index": 20,
      "text": "- Page 8: \"differntiable methods for NAS.\" differentiable is misspelled.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_typo",
      "aspect": "asp_clarity",
      "polarity": "pol_negative"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 0,
      "text": "Thank you for your thorough and helpful review.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 1,
      "text": "We also believe that the criteria we identified will be useful for others in narrowing the search for functions that approximate the generalization error of NNs in realistic settings with no access to the true data distribution.",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_summary",
      "alignment": [
        "context_sentences",
        [
          10
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 2,
      "text": "Concerns regarding overfitting and uncertainty estimation: Given your suggestion, we performed 10-fold cross validation in all tasks and found high quality results and cross-fold consistency.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          13,
          14
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 3,
      "text": "We now report updated cross-val for all results in section 6 including figures 3,4 and in the newly-added figure 5.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          13,
          14
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 4,
      "text": "We believe that this addresses both the overfitting concern and the uncertainty estimation concern.",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_summary",
      "alignment": [
        "context_sentences",
        [
          13,
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 5,
      "text": "Additional evidence that there is no overfitting is the good extrapolation results (section 7), as acknowledged by the reviewer.",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_summary",
      "alignment": [
        "context_sentences",
        [
          13,
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 6,
      "text": "Regarding the envelope function (equation 5): This form of function is a simple case of the (complex) rational function family (simple pole at $\\eta$, simple zero at the origin in this case).",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_summary",
      "alignment": [
        "context_sentences",
        [
          11
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 7,
      "text": "This family arises naturally in transitory systems in control theory and electrical engineering, e.g., when considering the frequency response of systems.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_summary",
      "alignment": [
        "context_sentences",
        [
          11
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 8,
      "text": "It captures naturally powerlaw transitions.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_summary",
      "alignment": [
        "context_sentences",
        [
          11
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 9,
      "text": "With that said, as we stress in the end of section 5, the particular choice of envelope is merely a convenience one and there may be other such functions / refinements.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_summary",
      "alignment": [
        "context_sentences",
        [
          11
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 10,
      "text": "We leave further exploration of this aspect to future work.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_future",
      "alignment": [
        "context_sentences",
        [
          11
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 11,
      "text": "We have fixed the misspelling in \u201cdifferentiable\u201d.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          20
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "rkeru_QAYH",
      "rebuttal_id": "H1xrAsVPor",
      "sentence_index": 12,
      "text": "Thanks for pointing this out.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_concede-criticism",
      "alignment": [
        "context_sentences",
        [
          20
        ]
      ],
      "details": {}
    }
  ]
}