\section*{\centering Reproducibility Summary}

% \textit{Template and style guide to \href{https://paperswithcode.com/rc2022}{ML Reproducibility Challenge 2022}. The following section of Reproducibility Summary is \textbf{mandatory}. This summary \textbf{must fit} in the first page, no exception will be allowed. When submitting your report in OpenReview, copy the entire summary and paste it in the abstract input field, where the sections must be separated with a blank line.
% }

\subsubsection*{Scope of Reproducibility}
In this work, we evaluate the reproducibility of the paper \textit{Label-Free Explainability for Unsupervised Models} by Crabbe and van der Schaar \cite{mainpaper}. Our goal is to reproduce the paper's four main claims in a label-free setting: (1) feature importance scores determine salient features of a model's input, (2) example importance scores determine salient training examples to explain a test example, (3) interpretability of saliency maps  is hard for disentangled VAEs, (4) distinct pretext tasks don’t have interchangeable representations.

% : (i) pretext task  representations are not interchangeable and (ii) labels in classification make example importance representations special compared to label-free approaches.

% State the main claim(s) of the original paper you are trying to reproduce (typically the main claim(s) of the paper).
% This is meant to place the work in context, and to tell a reader the objective of the reproduction.

\subsubsection*{Methodology}
The authors of the paper provide an implementation in PyTorch for their proposed techniques and experiments. We reuse and extend their code for our additional experiments. Our reproducibility study comes at a total computational cost of 110 GPU hours, using an NVIDIA Titan RTX. 

% Briefly describe what you did and which resources you used. For example, did you use author's code? Did you re-implement parts of the pipeline? You can use this space to list the hardware and total budget (e.g. GPU hours) for the experiments. 

\subsubsection{Results}
We reproduced the original paper's work through our experiments. We find that the main claims of the paper largely hold. We assess the robustness and generalizability of some of the claims, through our additional experiments. In that case, we find that one claim is not generalizable and another is not reproducible for the graph dataset.

% We also extend their third claim, showing that increasing the disentanglement factor of disentangled VAEs with attribution priors can make latent units focus on distinct features of the input.

% Start with your overall conclusion --- where did your results reproduce the original paper, and where did your results differ? Be specific and use precise language, e.g. "we reproduced the accuracy to within 1\% of reported value, which supports the paper's conclusion that it outperforms the baselines". Getting exactly the same number is in most cases infeasible, so you'll need to use your judgment to decide if your results support the original claim of the paper.

\subsubsection*{What was easy}
The original paper is well-structured. The code implementation is well-organized and with clear instructions on how to get started. This was helpful to understand the paper's work and begin experimenting with their proposed methods.
% Describe which parts of your reproduction study were easy. For example, was it easy to run the author's code, or easy to re-implement their method based on the description in the paper? The goal of this section is to summarize to a reader which parts of the original paper they could easily apply to their problem.

\subsubsection*{What was difficult}
We found it difficult to extrapolate some of the authors' proposed techniques to datasets other than those used by them.  Also, we were not able to reproduce the results for one of the experiments. We couldn't find the exact reason for it by running explorative experiments due to time and resource constraints. 

% Describe which parts of your reproduction study was difficult or took much more time than you expected. Perhaps the data was not available and you couldn't verify some experiments or the author's code was broken and had to be debugged first. Or, perhaps some experiments just take too much time/resources to run and you couldn't verify them. The purpose of this section is to indicate to the reader which parts of the original paper are either difficult to re-use, or require a significant amount of work and resources to verify.

\subsubsection*{Communication with original authors}
We reached out to the authors once about our queries regarding one experimental setup and to understand the assumptions and contexts of some sub-claims in the paper. We received a prompt response which satisfied most of our questions. 
% Briefly describe how many contacts you had with the original authors (if any).
