{
  "metadata": {
    "forum_id": "rkgwuiA9F7",
    "review_id": "Sker8vbtp7",
    "rebuttal_id": "SygaBRNB07",
    "title": "Cramer-Wold AutoEncoder",
    "reviewer": "AnonReviewer3",
    "rating": 5,
    "conference": "ICLR2019",
    "permalink": "https://openreview.net/forum?id=rkgwuiA9F7&noteId=SygaBRNB07",
    "annotator": "anno10"
  },
  "review_sentences": [
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 0,
      "text": "This paper proposes the Cramer-Wold autoencoder.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 1,
      "text": "The first contribution of the paper is to propose the Cramer-Wold distance between two distributions based on the Cramer-Wold Theorem.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 2,
      "text": "More specifically, in order to compute the Cramer-Wold distance, we first find the one dimensional projections of the distributions over random slices, and then compute the average L2 distances of the kernel density estimates of these projections over random slices.",
      "suffix": "",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 3,
      "text": "The second contribution of the paper is to develop a generative autoencoder which uses the Cramer-Wold distance to match the latent distribution of the data to the prior distribution.",
      "suffix": "\n\n",
      "review_action": "arg_structuring",
      "fine_review_action": "arg-structuring_summary",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 4,
      "text": "While I found the derivation of the Cramer-Wold distance interesting, the final form of this distance (Eq. 2), to me, looks very similar to the MMD with a particular kernel.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_originality",
      "polarity": "pol_negative"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 5,
      "text": "My main question is that: what is the main advantage of the Cramer-Wold distance to an MMD with a proper kernel?",
      "suffix": "\n\n",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_explanation",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 6,
      "text": "The paper points out that the main theoretical contribution is that in the case of the Gaussian distribution, the Cramer-Wold distance has a closed form.",
      "suffix": "",
      "review_action": "arg_fact",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 7,
      "text": "However, I believe this is also the case in the MMD, since if one of the distributions is Gaussian or analytically known, then E[k(x,x')] in the MMD can be analytically computed.",
      "suffix": "\n\n",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_originality",
      "polarity": "pol_negative"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 8,
      "text": "The paper further uses this closed form property of the Cramer-Wold distance to propose the Cramer-Wold autoencoder with Gaussian priors.",
      "suffix": "",
      "review_action": "arg_fact",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 9,
      "text": "My question here is that how is this method better than the standard VAE, where we also have an analytic form for the ELBO when the prior is Gaussian, an no sampling is required.",
      "suffix": "",
      "review_action": "arg_request",
      "fine_review_action": "arg-request_explanation",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 10,
      "text": "Indeed, in VAEs, the prior does not have to be Gaussian, and as long as the density of the prior can be evaluated, we can efficiently optimize the ELBO without sampling the prior; which I don't think is the case for the Cramer-Wold autoencoder.",
      "suffix": "",
      "review_action": "arg_evaluative",
      "fine_review_action": "none",
      "aspect": "asp_substance",
      "polarity": "pol_negative"
    },
    {
      "review_id": "Sker8vbtp7",
      "sentence_index": 11,
      "text": "I believe the main advantages of methods such as WAE is that they can impose priors that do not have exact analytic forms.",
      "suffix": "",
      "review_action": "arg_fact",
      "fine_review_action": "none",
      "aspect": "none",
      "polarity": "none"
    }
  ],
  "rebuttal_sentences": [
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 0,
      "text": "We thank the reviewer for insight into our paper.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 1,
      "text": "The reviewer found some points, where we were not clear enough. It is now the time to respond to them.",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 2,
      "text": "1. The reviewer noticed,",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 3,
      "text": "that  \u201cin the MMD, since if one of the distributions is Gaussian or analytically known, then E[k(x,x')] in the MMD can be analytically computed\u201d.",
      "suffix": "\n\n",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          7
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 4,
      "text": "According to the best knowledge of the authors, the Cramer-Wold kernel (which defines the Cramer-Wold metric), except for the classical RBF kernel, is the only known characteristic kernel which has closed form for radial gaussians, and we believe the respective computations in other cases (like the inverse quadratic kernel used in WAE-MMD), would be highly nontrivial.",
      "suffix": "\n\n",
      "rebuttal_stance": "dispute",
      "rebuttal_action": "rebuttal_reject-criticism",
      "alignment": [
        "context_sentences",
        [
          7
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 5,
      "text": "2.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_in-rebuttal",
        null
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 6,
      "text": "The reviewer also points out, that the evidence lower bound ELBO, when used with a notiGaussian prior results in case of VAE in a generally analytic formula.",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_structuring",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 7,
      "text": "It was never the intention of the authors to sneak in that VAE cannot do it.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 8,
      "text": "Our primary goal was to define a method for training the Gaussian prior generative model using a different closed form formula for the distribution distance.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 9,
      "text": "At the same time VAE requires encoder to be Gaussian non-deterministic, and random decoder, which is not the case in CWAE (as well as in a WAE model, see Tolstikhin https://arxiv.org/pdf/1711.01558.pdf).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 10,
      "text": "The kernel used in the derivation is not a Gaussian kernel but has a closed form formula for a product of two Gaussians (see last equation in the current paper), itself not being Gaussian.",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 11,
      "text": "The Gaussian kernel itself is not well suited,",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 12,
      "text": "because it has an exponential rate of decay, and loses much information on the outliers (see also Bi\u0144kowski et al.,  https://arxiv.org/pdf/1801.01401.pdf, section 2.1).",
      "suffix": "",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 13,
      "text": "Our objective was to add a method alternative to the WAE method, but simpler in use (e.g. less parameters to be found).",
      "suffix": "\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_answer",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {}
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 14,
      "text": "We have extended the contribution part (in the introduction) and added Sections A and B to the Appendix, to make things clearer.",
      "suffix": "\n\n\n",
      "rebuttal_stance": "concur",
      "rebuttal_action": "rebuttal_done",
      "alignment": [
        "context_sentences",
        [
          9
        ]
      ],
      "details": {
        "request_out_of_scope": true
      }
    },
    {
      "review_id": "Sker8vbtp7",
      "rebuttal_id": "SygaBRNB07",
      "sentence_index": 15,
      "text": "Thank you again for your comments and suggestions. Have our responses and the changes we made to the manuscript addressed all of your concerns?",
      "suffix": "",
      "rebuttal_stance": "nonarg",
      "rebuttal_action": "rebuttal_social",
      "alignment": [
        "context_global",
        null
      ],
      "details": {}
    }
  ]
}