{
  "metadata": {
    "forum_id": "Hye64hA9tm",
    "review_id": "rklOwd45hm",
    "rebuttal_id": "BJgm9FTSCm",
    "title": "Measuring Density and Similarity of Task Relevant Information in Neural Representations",
    "reviewer": "AnonReviewer1",
    "rating": 5,
    "conference": "ICLR2019",
    "permalink": "https://openreview.net/forum?id=Hye64hA9tm&noteId=BJgm9FTSCm",
    "annotator": "anno0"
  },
  "review_sentences": [
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 0,
      "text": "MEASURING DENSITY AND SIMILARITY OF TASK RELEVANT INFORMATION IN NEURAL REPRESENTATIONS",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_quote",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 1,
      "text": "Summary:",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 2,
      "text": "This work attempts to define two kinds of metrics (metrics for information density and for information similarity) for the sake of automatically detecting similarity between tasks so that transfer learning can be done more efficiently.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 3,
      "text": "The concepts are clearly explained, and the metric for information density seems to match up with intuitions coming out of forward selections approaches.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_clarity",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 4,
      "text": "The metric for information transfer seems to be the commonplace metric that other works default to when they show that pre-trained representations are effective on downstream tasks.",
      "suffix": "",
      "review_action": "arg_fact",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 5,
      "text": "It is not clear that the notion of similarity through classifier weights makes sense, but see below for clarification questions.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 6,
      "text": "The problem addressed (automatic similarity scoring of tasks) is important for transfer learning, and thus the results have potential to be very impactful if they generalize to other kinds of tasks; as is, they seem to apply only to classification tasks, but that is a good step.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_motivation-impact",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 7,
      "text": "Pros:",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 8,
      "text": "Clearly written; experiments on the datasets chosen do seem to suggest that the proposed methods have potential.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_motivation-impact",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 9,
      "text": "Brings in nice intuition from forward feature selection.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_motivation-impact",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 10,
      "text": "An important problem with potential for high impact.",
      "suffix": "\n\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_motivation-impact",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 11,
      "text": "Cons:",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 12,
      "text": "It is not clear to me that the classifier difference metric is well-defined.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_replicability",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 13,
      "text": "Is there a constraint on the CFS and classifiers that ensure the difference between the weights really captures what is suggested?",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_clarification",
      "aspect": "asp_clarity",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 14,
      "text": "Is it not the case that classifier weights could come out quite different despite the tasks being quite similar if the linear classifiers learned to capitalize on dissimilar, yet equally fruitful patterns in the input features?",
      "suffix": "\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_clarification",
      "aspect": "asp_soundness-correctness",
      "polarity": "none"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 15,
      "text": "Do you have thoughts on how this could be applied outside the context of sentence representations and further outside the context of classification?",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_explanation",
      "aspect": "asp_substance",
      "polarity": "none"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 16,
      "text": "Those seem to be quite limiting features of these methods, which is not to say that they are not useful in that realm, but only to clarify my understanding of their possible scope of application.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 17,
      "text": "These classification datasets are often so close, that I do wonder whether even simpler methods would work just as well.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 18,
      "text": "For example, clustering on bags-of-words might also show that SST, SST-fine, and IMDb are close/similar/transferable.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 19,
      "text": "The same could be said for SICK and SNLI.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 20,
      "text": "It would be nice to see a comparison to such baselines in order to get a sense of how the proposed methods give insights that other unsupervised or supervised methods might give just as well.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_meaningful-comparison",
      "polarity": "pol_positive"
    },
    {
      "review_id": "rklOwd45hm",
      "sentence_index": 21,
      "text": "Otherwise, it is hard to tell how significant these correlations are. Since the end goal is to determine transferability of tasks and not the methods, it does seem like there are simpler baselines that you could compare against.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_meaningful-comparison",
      "polarity": "pol_negative"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 0,
      "text": "We thank you for your thoughtful review. We are happy to learn that you believe it is an interesting direction that holds potential for high impact.",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 1,
      "text": "Re: simpler methods (like clustering, BoW etc.) might work equally well",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          17,
          18
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 2,
      "text": ">",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          17,
          18
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 3,
      "text": "To assess transfer learning potential reliably, we require both the X and y for the target task (i.e supervision).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          17,
          18,
          19,
          20,
          21
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 4,
      "text": "Consider the case where the target task is sentiment analysis, and one of the candidate tasks is finding sentence length (SentLen).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          17,
          18,
          19,
          20,
          21
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 5,
      "text": "For the sake of the argument, let us assume that the X for both sentiment analysis and sentence length is exactly the same set of movie reviews.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          17,
          18,
          19,
          20,
          21
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 6,
      "text": "In such a case, unsupervised metrics like clustering, BoW etc. would indicate maximum transfer potential, whereas the actual transfer potential would be close to zero (assuming the lengths of reviews aren\u2019t correlated with the sentiment).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          17,
          18,
          19,
          20,
          21
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 7,
      "text": "This is a fundamental problem of measures that look directly at the input data X without considering the nature of the labels y.",
      "suffix": "\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          17,
          18,
          19,
          20,
          21
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 8,
      "text": "For the sake of completeness, we will compare our methods with the suggested baselines in the camera ready/future versions of the paper.",
      "suffix": "\n\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_by-cr",
      "alignment": [
        "context_sentences",
        [
          17,
          18,
          19,
          20,
          21
        ]
      ],
      "details": {
        "manuscript_change": true
      }
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 9,
      "text": "Re: not clear if the classifier weight difference is well defined",
      "suffix": "\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_concede-criticism",
      "alignment": [
        "context_sentences",
        [
          12,
          13,
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 10,
      "text": "> You are right in noting that the classifier weights might capture dissimilar yet useful features for two similar tasks, and hence the classifier weight difference might under-predict the transfer potential.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_concede-criticism",
      "alignment": [
        "context_sentences",
        [
          12,
          13,
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 11,
      "text": "We discuss this issue in the paper (section 4.1), which is why we avoid the set overlap metric.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_mitigate-criticism",
      "alignment": [
        "context_sentences",
        [
          12,
          13,
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 12,
      "text": "Owing to similar concerns, we recommend using CFS information transfer metric over classifier weight difference (which is also supported by results in Table 2 and Figure 3).",
      "suffix": "\n\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          12,
          13,
          14
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 13,
      "text": "Re: thoughts on how this could be applied outside the context of sentence representations and classification",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          15,
          16
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 14,
      "text": "> It is easy to adopt our approach to study the information encoded in the encoders for other problems involving structured prediction (say POS Tagging).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          15,
          16
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 15,
      "text": "Instead of using a decoder that takes in all the dimensions of the encoded input token, one could iteratively select dimensions that provide the highest gains in decoding the right target sequence (say POS tags).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          15,
          16
        ]
      ],
      "details": {}
    },
    {
      "review_id": "rklOwd45hm",
      "rebuttal_id": "BJgm9FTSCm",
      "sentence_index": 16,
      "text": "Our formulation is very general, and it could potentially also be applied to other modalities like images for tasks like image classification and captioning.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          15,
          16
        ]
      ],
      "details": {}
    }
  ]
}