{
  "metadata": {
    "forum_id": "HJgOl3AqY7",
    "review_id": "HJgaRiJcnm",
    "rebuttal_id": "SyltxKsTpm",
    "title": "Modulated Variational Auto-Encoders for Many-to-Many Musical Timbre Transfer",
    "reviewer": "AnonReviewer1",
    "rating": 5,
    "conference": "ICLR2019",
    "permalink": "https://openreview.net/forum?id=HJgOl3AqY7&noteId=SyltxKsTpm",
    "annotator": "anno9"
  },
  "review_sentences": [
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 0,
      "text": "Summary",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 1,
      "text": "-------",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 2,
      "text": "This paper describes a model for musical timbre transfer which builds on recent developments in domain- and style transfer.",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 3,
      "text": "The proposed method is designed to be many-to-many, and uses a single pair of encoders and decoders with additional conditioning inputs to select the source and target domains (timbres).",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 4,
      "text": "The method is evaluated on a collection of individual note-level recordings from 12 instruments, grouped into four families which are used as domains.",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 5,
      "text": "The method is compared against the UNIT model under a variety of training conditions, and evaluated for within-domain reconstruction and transfer accuracy as measured by maximum mean discrepancy.",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 6,
      "text": "The proposed model seems to improve on the transfer accuracy, with a slight hit to reconstruction accuracy.",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 7,
      "text": "Qualitative investigation demonstrates that the learned representation can approximate several coarse spectral descriptors of the target domains.",
      "suffix": "\n\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 8,
      "text": "High-level comments",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 9,
      "text": "-------------------",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 10,
      "text": "Overall, this paper is well written, and the various design choices seem well-motivated.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 11,
      "text": "The empirical comparisons to UNIT are reasonably thorough, though I would have preferred more in-depth evaluation of the MoVE model as well.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 12,
      "text": "Specifically, the authors introduced an extra input (control) to encode the pitch class and octave information during encoding.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 13,
      "text": "I infer that this was necessary to achieve good performance, but it would be instructive to see the results without this additional input, since it does in a sense constitute a form of supervision, and therefore limits the types of training data which can be used.",
      "suffix": "\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_substance",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 14,
      "text": "While I understand that quantifying performance in this application is difficult, I do find the results difficult to interpret.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 15,
      "text": "Some of this comes down to incomplete definition of the metrics (see detailed comments below).",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 16,
      "text": "However, the more pressing issue is that evaluation is done either sample-wise within-domain (reconstruction), or distribution-wise across domains (transfer).",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 17,
      "text": "The transfer metrics (MMD and kNN) are opaque to the reader: for instance, in table 1, is a knn score of 43173 qualitatively different than 43180?",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_clarification",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_negative"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 18,
      "text": "What is the criteria for bolding here?",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_clarification",
      "aspect": "asp_soundness-correctness",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 19,
      "text": "It would be helpful if these scores could be calibrated in some way, e.g., with reference to",
      "suffix": "\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_result",
      "aspect": "asp_soundness-correctness",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 20,
      "text": "MMD/KNN scores of random partitions of the target domain samples.",
      "suffix": "\n\n",
      "review_action": "none",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 21,
      "text": "Since the authors do additional information here for each sample (notes), it would be possible to pair generated and real examples by instrument and note, rather than (in addition to) unsupervised, feature-space pairing by MMD.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_soundness-correctness",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 22,
      "text": "This could provide a slightly stronger version of the comparison in Figure 3, which shows that the overall distribution of spectral centroids is approximated by transfer, but does not demonstrate per-sample correspondence.",
      "suffix": "\n\n\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_experiment",
      "aspect": "asp_meaningful-comparison",
      "polarity": "pol_positive"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 23,
      "text": "Detailed comments",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 24,
      "text": "-----------------",
      "suffix": "\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_heading",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 25,
      "text": "At several points in the manuscript, the authors refer to \"invertible\" representations (e.g., page 4, just after eq. 1), but it seems like what they mean is approximately invertible or decodable.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_edit",
      "aspect": "asp_clarity",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 26,
      "text": "It would be better if the authors were a little more careful in their use of terminology here.",
      "suffix": "\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_edit",
      "aspect": "asp_clarity",
      "polarity": "pol_negative"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 27,
      "text": "In the definition of the RBF kernel (page 4), why is there a summation?",
      "suffix": "\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_clarification",
      "aspect": "asp_clarity",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 28,
      "text": "What does this index? How are the kernel bandwidths defined?",
      "suffix": "\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_clarification",
      "aspect": "asp_clarity",
      "polarity": "none"
    },
    {
      "review_id": "HJgaRiJcnm",
      "sentence_index": 29,
      "text": "How exactly are reconstruction errors calculated: using the NSGT magnitude representation, or after resynthesis in the time domain?",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_explanation",
      "aspect": "asp_substance",
      "polarity": "none"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 0,
      "text": "Thank you for your detailed review and the constructive comments on our work.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 1,
      "text": "We note the remarks on the paper writing that we will correct and answer below the main points that were commented.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_concede-criticism",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 2,
      "text": "* In-depth evaluation of MoVE and comparison of with/without conditioning:",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          11,
          12
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 3,
      "text": "We agree and this was also pointed by 'AnonReviewer2', we are working on new incremental benchmarks, more detailed on both one-to-one and many-to-many models.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_by-cr",
      "alignment": [
        "context_sentences",
        [
          11,
          12,
          13
        ]
      ],
      "details": {
        "manuscript_change": false
      }
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 4,
      "text": "Moreover, the need of pitch/octave conditioning limits the applicability of our model to transfer only on audio carrying such note features.",
      "suffix": "",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_contradict-assertion",
      "alignment": [
        "context_sentences",
        [
          11,
          12,
          13
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 5,
      "text": "Hence we trained models without conditioning mechanism and, as answered to 'AnonReviewer2', we are planning experiments on models which are conditional but integrating an unconditioned state to be trained in parallel of the note-conditional state.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_by-cr",
      "alignment": [
        "context_sentences",
        [
          11,
          12,
          13
        ]
      ],
      "details": {
        "manuscript_change": false
      }
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 6,
      "text": "**",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_other",
      "alignment": [
        "context_none",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 7,
      "text": "*",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_other",
      "alignment": [
        "context_none",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 8,
      "text": "Interpretability of the generative scores:",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          14,
          15,
          19,
          20
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 9,
      "text": "We agree on this remark, the idea of scaling scores is right and would improve the interpretability of our benchmarks.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_future",
      "alignment": [
        "context_sentences",
        [
          14,
          15,
          19,
          20
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 10,
      "text": "For that purpose, we should define a set of reference scores as you recommended to.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_future",
      "alignment": [
        "context_sentences",
        [
          14,
          15,
          19,
          20
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 11,
      "text": "* Incomplete definition of the metrics:",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          15,
          16,
          17
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 12,
      "text": "We gave references to the papers that introduced such metrics.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          15,
          16,
          17
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 13,
      "text": "Discussing a set of reference scores should also come with a better explanation of these.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          15,
          16,
          17
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 14,
      "text": "* Criteria for bolding: we intended to highlight the best scores",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          18
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 15,
      "text": "**",
      "suffix": "",
      "rebuttal_stance": "other",
      "rebuttal_action": "rebuttal_none",
      "alignment": [
        "context_error",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 16,
      "text": "* Pairing generated and real examples by instrument and note to compare:",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          21,
          22
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 17,
      "text": "In addition to the spectral descriptor distribution plots, we used sample-specific scatter plots to visualize how the transfer maps them individually.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          21,
          22
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 18,
      "text": "On the overlap of each instrument tessitura, we can make such pairing.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          21,
          22
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 19,
      "text": "We can also transfer and transpose to the target instrument tessitura if needed.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          21,
          22
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 20,
      "text": "Remains the question of which metric can be used here to evaluate generation at the sample-level (?), as our model does not aim at reconstructing an hypothetical corresponding sample in the target domain but rather at blending in features from the other domain so that it sounds like the input note (pitch, octave but also some dynamics/style qualities relative to the input instrument) played by the target instrument.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          21,
          22
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 21,
      "text": "We later aim at experimenting on mechanisms to control the amount of target feature blending in the process of transfer.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          21,
          22
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 22,
      "text": "* Invertible ? Decodable ? Approximate inversion ?",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          25,
          26
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 23,
      "text": "We agree that the current state of the research should be stated as using approximate spectrogram inversion.",
      "suffix": "\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          25,
          26
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 24,
      "text": "We plan on replacing the iterative slow spectrogram inversion with Griffin-Lim by faster decoding with Multi-head Convolutional Neural Networks, arXiv:1808.06719, Sercan O. Arik et al.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_by-cr",
      "alignment": [
        "context_sentences",
        [
          25,
          26
        ]
      ],
      "details": {
        "manuscript_change": false
      }
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 25,
      "text": "*",
      "suffix": "",
      "rebuttal_stance": "other",
      "rebuttal_action": "rebuttal_none",
      "alignment": [
        "context_error",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 26,
      "text": "**",
      "suffix": "",
      "rebuttal_stance": "other",
      "rebuttal_action": "rebuttal_none",
      "alignment": [
        "context_error",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 27,
      "text": "Definition of the RBF kernel:",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          27
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 28,
      "text": "The summation is on the alpha parameter which can be a list of n values (or a single float value).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          27
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 29,
      "text": "The trainings were done with n=3 and alpha=[1. , 0.1 , 0.05].",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          27,
          28
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 30,
      "text": "Depending on the kernel and bandwidth definitions, we may link both as",
      "suffix": "\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          27,
          28
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 31,
      "text": "alpha = 1 / (2 x bandwidth**2).",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          27,
          28
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 32,
      "text": "* Calculation of reconstruction errors:",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          29
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 33,
      "text": "All scores are computed on NSGT magnitude spectrogram slices.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          29
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 34,
      "text": "No evaluation (except listening) is done on the time-domain waveforms.",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          29
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 35,
      "text": "The points marked with *** are highlighted as we would gratefully receive further remarks from your review.",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 36,
      "text": "How would you recommend making reference scores to the MMD/kNN evaluations ?",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_followup",
      "alignment": [
        "context_sentences",
        [
          19,
          20
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 37,
      "text": "How would you recommend comparing pairs of generated and ~ corresponding target domain samples ? (at the sample level)",
      "suffix": "\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_followup",
      "alignment": [
        "context_sentences",
        [
          21,
          22
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 38,
      "text": "Is the definition of the RBF kernel correct to you given that clarification (that should be added to the paper) ?",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_followup",
      "alignment": [
        "context_sentences",
        [
          27
        ]
      ],
      "details": {}
    },
    {
      "review_id": "HJgaRiJcnm",
      "rebuttal_id": "SyltxKsTpm",
      "sentence_index": 39,
      "text": "Thanks again for the interesting feedbacks !",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    }
  ]
}