\rebuttal{
\section{Membership inference attacks in the centralized setup without knowledge of training samples}
\label{sec:shadow_training}
In the setup of \sectionref{subsec:centralized_result}, we assumed that the adversary has access to some of the training samples when performing membership inference attacks. However, such an assumption may be too restrictive. Here, we discuss white-box membership inference attacks in which the attacker has access only to samples from the training distribution, and does not know whether any of these samples were used for training.

Since the attacker does not know which samples were used to train the model, the attack classifier cannot be trained directly on the model being attacked. To circumvent this limitation, we use the idea of shadow training from~\citet{nasr2018machine}. Briefly, the attacker trains new models with the same architecture and training hyperparameters, using the samples available from the training distribution. These newly trained models, called shadow models, are expected to imitate, or shadow, the behavior of the trained model, for example its overfitting behavior and its training performance. The attacker can therefore train the attack classifier on the shadow models and the samples used to train them, and expect it to transfer to the trained model under attack.
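
One way to write this procedure down is sketched below. The notation ($f_{\theta}$, $f_{\phi}$, $D_{\mathrm{s}}$, $g$, $h$, $\ell$) is introduced here only for illustration and is an assumption, not notation used elsewhere in the paper. Let $f_{\theta}$ denote the model being attacked and $f_{\phi}$ a shadow model trained by the attacker on a set $D_{\mathrm{s}}$ of samples from the training distribution. For a labeled sample $(x, y)$, let $g(f, x, y)$ denote the features that the attacker extracts from a model $f$. The attack classifier $h$ is fit on the shadow model, for which the attacker knows the membership of every sample,
\[
  h \approx \arg\min_{h'} \sum_{(x, y)} \ell\Bigl( h'\bigl(g(f_{\phi}, x, y)\bigr),\; \mathbf{1}\bigl[(x, y) \in D_{\mathrm{s}}\bigr] \Bigr),
\]
where the sum runs over samples available to the attacker, both inside and outside $D_{\mathrm{s}}$, and $\ell$ is some classification loss. The fitted classifier is then applied to features extracted from the model being attacked,
\[
  \hat{m}(x, y) = h\bigl(g(f_{\theta}, x, y)\bigr).
\]
Under this reading, the attack transfers to the extent that the features $g(f_{\phi}, \cdot)$ and $g(f_{\theta}, \cdot)$ separate members from non-members in a similar way.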




\subsection{Setup}
For this section, we use the train, test, and validation split described in \sectionref{subsec:centralized_training_details}. We assume that the attacker has access to a trained model and to some samples from the training distribution, which may or may not overlap with the samples used to train the model being attacked. The attacker intends to identify whether a given data sample was used to train the model.

The attacker targets the same models that are described in \appendixref{sec:appendix_training_data_details}. These models are trained on the full training set.
To simulate an attacker with access to the training distribution but not to known training samples, we give the attacker 5000 samples picked at random from the original training set of size 7312.
The attacker tries to determine the membership of samples from the training set that are not among these 5000 samples.
Due to limited data, the data used to train the shadow models overlaps with the data used to train the original model. A more difficult scenario would be one in which these datasets do not overlap at all.
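
As a quick check of the sample counts above (the symbols $N_{\mathrm{train}}$, $N_{\mathrm{atk}}$, and $N_{\mathrm{rest}}$ are shorthand introduced here and are not used elsewhere), the number of training samples that the attacker does not hold is
\[
  N_{\mathrm{rest}} = N_{\mathrm{train}} - N_{\mathrm{atk}} = 7312 - 5000 = 2312 .
\]
Member samples used to evaluate the attack must therefore be drawn from this pool of 2312 samples.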

\subsection{Result}
To evaluate the membership inference attack accuracy, we created a test dataset of 1500 samples from the full training set (disjoint from the 5000 samples that the attacker already has) and 1500 samples from the unseen set.
We trained a single shadow model on the 5000 samples available to the attacker. As in the earlier experiments, the attack classifier is trained to attack the shadow model, using the prediction, the label, and the gradients of the \texttt{conv6} and \texttt{output} layers of the shadow model as features. We then extract the same features from the trained model and classify them with the attack classifier to infer membership. The results are summarized in \tableref{tab:shadow_training}. The `Test' column shows the accuracy of the membership inference attack on the trained model, which is the quantity of interest. The `Validation' column reports the attack accuracy on the validation set derived from the shadow model's training set. We observe that membership inference attacks remain feasible even without access to known training samples, albeit with slightly lower accuracy than when the adversary has access to some of the training samples.
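
For reference, the test dataset described above is balanced, so the accuracy of an attacker who guesses at random, or who always predicts the same class, is
\[
  \frac{1500}{1500 + 1500} = 50\% .
\]
This chance level is not reported as a separate baseline in \tableref{tab:shadow_training}; we state it here only as the natural point of comparison for the `Test' column, assuming that the table entries are accuracies in percent.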


\begin{table}[tbp]
    \centering
    \begin{tabular}{l c c}
    \toprule
    Model & Test & Validation \\
    \cmidrule(r){1-1} \cmidrule(lr){2-2} \cmidrule(lr){3-3}
    \texttt{3D-CNN} & $71.74 \pm 1.82$ & $75.22 \pm 0.22$ \\
    \texttt{2D-slice-mean} & $74.39 \pm 2.14$ & $85.46 \pm 0.24$ \\
    \bottomrule
    \end{tabular}
    \caption{Membership inference attack accuracy (in \%) without knowledge of training samples. The `Test' column reports the accuracy of membership inference on the trained model using an attack classifier trained on information from the shadow model. The `Validation' column reports the attack classifier's accuracy on the validation set derived from the shadow model's training set.}
    \label{tab:shadow_training}
\end{table}
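
As a rough illustration of how well the attack transfers from the shadow model to the trained model, the gap between the mean `Validation' and `Test' accuracies in \tableref{tab:shadow_training} can be computed directly. This assumes that the entries are accuracies in percent, and it ignores the reported spreads:
\[
  \texttt{3D-CNN}: \; 75.22 - 71.74 = 3.48,
  \qquad
  \texttt{2D-slice-mean}: \; 85.46 - 74.39 = 11.07 .
\]
Under this simple comparison of means, the drop is about 3.5 percentage points for \texttt{3D-CNN} and about 11.1 percentage points for \texttt{2D-slice-mean}.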
}
