% \documentclass[twoside]{article}

\documentclass[accepted]{uai2022}
\usepackage[american]{babel}
%\usepackage[accepted]{aistats2022}
% If you set papersize explicitly, activate the following three lines:
% \special{papersize = 8.5in, 11in}
% \setlength{\pdfpageheight}{11in}
% \setlength{\pdfpagewidth}{8.5in}

% If you use natbib package, activate the following three lines:
%\usepackage[round]{natbib}
%\renewcommand{\bibname}{References}
%\renewcommand{\bibsection}{\subsubsection*{\bibname}}

% If you use BibTeX in apalike style, activate the following line:
% \bibliographystyle{apalike}
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage[hyphens]{url} 
\usepackage{graphicx}
\usepackage{amsmath,amssymb,amsthm}
\usepackage{multirow}
\usepackage[tableposition=above]{caption}
\captionsetup[table]{skip=10pt}
\usepackage{subcaption}


\newtheorem{theorem}{Theorem}

\captionsetup[subfigure]{labelfont=bf,textfont=normalfont,singlelinecheck=off,justification=centering}

\title{CounteRGAN: Generating Counterfactuals for Real-Time Recourse and Interpretability using Residual GANs (Supplementary material)}

% The standard author block has changed for UAI 2022 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is automatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
\author[1]{\href{mailto:<nemird@amazon.com>?Subject=Your UAI 2022 paper}{Daniel~Nemirovsky}{}}
\author[2]{Nicolas~Thiebaut}
\author[3]{Ye~Xu}
\author[3]{Abhishek~Gupta}
\affil[1]{
  Amazon\\
  Seattle, WA, U.S.A.\\
  nemird@amazon.com
}
\affil[2]{
  Hired\\
  New York, NY, U.S.A.\\
  nicolas.thiebaut@hired.com
}
\affil[3]{
    Meta\\
    Menlo Park, CA, U.S.A.\\
    \{yexu, abigupta\}@fb.com
 }
 
\begin{document}
\onecolumn
\maketitle

\section{Additional experiment: Pima Indians Diabetes dataset}

Following the experiments in \cite{Wachter2017-jr}, we use the Pima Indians Diabetes dataset \citep{diabetes_dataset}. It consists of low-dimensional tabular data and helps validate the CounteRGAN's versatility and its applicability to diverse use cases. The dataset contains 8 features describing patient characteristics that are useful for predicting diabetes. The target label is positive if the patient has diabetes (268 examples) and negative otherwise (500 examples). We use stratified (label-balanced) sampling, assigning 80\% of the dataset to the training set and the remaining 20\% to the test set. The classifier uses the same neural network architecture as \cite{Wachter2017-jr} and achieves an accuracy of 74.68\% on the test set.
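
As a quick sanity check on these figures (our own arithmetic, not part of the original experiments; the symbols $N$, $N_{\mathrm{test}}$ and the majority-class baseline are introduced here only for illustration), the dataset size, the test-set size, and the accuracy of always predicting the negative class follow directly from the class counts:
\begin{align*}
N &= 268 + 500 = 768, \\
N_{\mathrm{test}} &= 0.2 \times 768 = 153.6 \approx 154, \\
\mathrm{acc}_{\mathrm{majority}} &= \frac{500}{768} \approx 65.10\%.
\end{align*}
The test-set size agrees with the 154 samples reported in the caption of Table~\ref{table:diabetes_metrics}.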
% \footnote{Note that this is relatively low compared to the 65.10 \% accuracy achieved using a random classifier as a baseline.}. 
% More details on the model architectures and parameters used for the counterfactual search methods can be found in the supplementary material.


\renewcommand{\arraystretch}{1.6}
\begin{table*}[hbt!]
\scriptsize
\centering
\begin{tabular}{l||c|c|c|c||c|c|c}
 \multirow{2}{*}{} &           \multicolumn{4}{c||}{White-box classifier} & \multicolumn{3}{c}{Black-box classifier} \\ 
&                    RGD &                        CSGP &                          GAN &                  CounteRGAN &    RGD & CSGP & CounteRGAN \\
\hline
$\uparrow$ Prediction gain &      0.15 $\pm$ 0.01 &           0.13 $\pm$ 0.02 &            0.15 $\pm$ 0.03 &  \textbf{0.33 $\pm$ 0.04} &  \textbf{0.17 $\pm$ 0.00} & 0.13 $\pm$ 0.00 & \textbf{0.16 $\pm$ 0.02} \\
$\downarrow$  Realism      &      2.20 $\pm$ 0.24 &           2.03 $\pm$ 0.11 &            3.33 $\pm$ 0.11 &  \textbf{1.79 $\pm$ 0.11} &  2.22 $\pm$ 0.01  & \textbf{1.98 $\pm$ 0.01} & 2.13 $\pm$ 0.12 \\
$\downarrow$  Actionability  &      1.64 $\pm$ 0.20 &  \textbf{1.14 $\pm$ 0.19} &            9.46 $\pm$ 0.53 &           6.91 $\pm$ 0.43 & 1.75 $\pm$ 0.02  & \textbf{1.29 $\pm$ 0.02} & 2.97 $\pm$ 0.12  \\
$\downarrow$  Latency (ms) &  1,195.91 $\pm$ 5.65 &       3,211.67 $\pm$ 11.65 &  1.68 $\pm$ 0.06 &          \textbf{1.51 $\pm$ 0.03} & 2,525.99 $\pm$ 1.23  & 15,921 $\pm$ 23.66 &  \textbf{1.82 $\pm$ 0.12} \\
$\downarrow$  Batch latency (s) & 204.58 & 483.88 & 0.26 & \textbf{0.23} & 453.45  & 2,228.23 & \textbf{0.32}
\end{tabular}
\medskip
\caption{Diabetes test data results (mean and 95\% confidence interval). The arrows indicate whether higher ($\uparrow$) or lower ($\downarrow$) values are better, and the best results are in bold. The realism metric typically ranges from 1.84 (mean reconstruction error on the test set) to 2.44 (reconstruction error on random Gaussian noise). Computations are performed using the entire test set (154 samples).}
\label{table:diabetes_metrics}
\end{table*}

For this experiment we introduce the important concept of \textit{mutable} and \textit{immutable features}. In most practical applications of counterfactual search, certain features are hard or impossible to change and can be considered immutable. Although features typically vary in their degree of mutability, for the purposes of this experiment we treat each feature as either mutable or immutable. For the Pima Indians Diabetes dataset, we consider the \textit{Pregnancies}, \textit{Age}, and \textit{Diabetes Pedigree Function} features to be immutable. We use \textit{Glucose}, \textit{Insulin}, \textit{Body Mass Index}, \textit{Triceps Skin Fold Thickness}, and \textit{Blood Pressure} as mutable features. In practice, we run the counterfactual search without modification and then cancel (set to zero) the perturbations applied to immutable features.
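
One way to write this cancellation step, as a sketch in our own notation (the binary mask $m$ and the symbols $\Delta$ and $x_{\mathrm{cf}}$ are illustrative assumptions, not notation taken from the main paper): let $\Delta(x)$ denote the perturbation returned by a counterfactual search method for an input $x$ (for the CounteRGAN, $\Delta(x) = G(x)$), and let $m \in \{0,1\}^8$ have $m_j = 1$ for mutable features and $m_j = 0$ for immutable ones. The reported counterfactual is then
\begin{equation*}
x_{\mathrm{cf}} = x + m \odot \Delta(x),
\end{equation*}
where $\odot$ is the element-wise product, so that $(x_{\mathrm{cf}})_j = x_j$ for every immutable feature $j$.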

Table~\ref{table:diabetes_metrics} summarizes our findings for this experiment. On this dataset, all methods improve the classifier's prediction, with the CounteRGAN attaining the largest prediction gain against the white-box classifier and a gain comparable to the best baseline against the black-box classifier. The CounteRGAN generates the most realistic instances in the white-box setting, and the CSGP outputs the sparsest counterfactuals. Even on this low-dimensional dataset, the CounteRGAN remains competitive with existing methods on prediction gain and realism while heavily outperforming them in terms of latency. Based on the mean latencies in the table, this amounts to speedups of about 800$\times$ to 2,100$\times$ for individual counterfactuals on the white-box classifier and about 1,400$\times$ to 8,700$\times$ on the black-box classifier, and roughly three to four orders of magnitude for batch generation of all counterfactuals.
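
The speedups can be recovered approximately by dividing the mean latency of each baseline by that of the CounteRGAN in Table~\ref{table:diabetes_metrics}. This is a back-of-the-envelope computation of ours that ignores the confidence intervals, with ratios rounded to the nearest integer:
\begin{align*}
\text{white-box, per sample:} \quad & 1195.91 / 1.51 \approx 792 \ \text{(RGD)}, \quad 3211.67 / 1.51 \approx 2127 \ \text{(CSGP)}, \\
\text{black-box, per sample:} \quad & 2525.99 / 1.82 \approx 1388 \ \text{(RGD)}, \quad 15921 / 1.82 \approx 8748 \ \text{(CSGP)}, \\
\text{white-box, batch:} \quad & 204.58 / 0.23 \approx 889 \ \text{(RGD)}, \quad 483.88 / 0.23 \approx 2104 \ \text{(CSGP)}, \\
\text{black-box, batch:} \quad & 453.45 / 0.32 \approx 1417 \ \text{(RGD)}, \quad 2228.23 / 0.32 \approx 6963 \ \text{(CSGP)}.
\end{align*}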

The evaluation results validate that the proposed CounteRGAN method can overcome the main limitations of existing methods, namely the lack of realism and high latency. It also provides similar or better prediction gain and actionability on high-dimensional images and on a low-dimensional tabular dataset. The latency improvements are pivotal for real-time applicability and scalability. They arise because the generator needs only a single forward pass through the neural network, whereas existing methods perform a new counterfactual search for every data point.


\section{Proof of Theorem 1}

\begin{theorem} %[Convergence of CounteRGAN-wt to $p_{C_t}$] 
If the discriminator is systematically allowed to reach its optimum, and the generator has sufficient capacity, then the minimax optimization of the value function 
\begin{equation}
\label{eq_countergan_nondiff}
    \mathcal{V}_{\mathrm{CounteRGAN-wt}}(D, G)=\frac{\sum_i C_t(x_i) \log D(x_i)}{\sum_i C_t(x_i)} 
+ \frac{1}{N} \sum_i \log \left(1-D(x_i+G(x_i))\right),
\end{equation}
converges to the Nash equilibrium. The full generator's output distribution $p_{g_+}$ converges to a distribution $p_{C_t}$ defined by

\begin{equation}
p_{C_t}(x) = \mathcal N_t \; C_t(x) \; p_\mathrm{data}(x),
\end{equation}
\newline
\noindent where $\mathcal N_t$ is a normalization constant.\footnote{Explicitly, $\mathcal N_t= \left(\int C_t(x) \; p_\mathrm{data}(x) \mathrm{d}x\right)^{-1}$, but it does not need to be computed for our purpose.} 
\end{theorem}


\begin{proof}We first introduce the full generator output function $G_+(x) = x + G(x)$, and note that the value function defined by Equation~\ref{eq_countergan_nondiff} can be written as 
\begin{equation}
\label{eq_countergan_nondiff_expectations}
\begin{aligned}
    \mathcal{V}_{\mathrm{CounteRGAN-wt}}(D, G)=\mathbb{E}_{x \sim p_{C_t}} \log D(x) + \mathbb{E}_{x \sim p_{g_+}} \log \left(1-D(x)\right),
\end{aligned}
\end{equation}
since the first term on the r.h.s. of Equation \ref{eq_countergan_nondiff} is a weighted sampling estimate of $\mathbb{E}_{x \sim p_{C_t}} \log D(x)$, and for the second term, the equality $\mathbb{E}_{x \sim p_{g_+}} \log \left(1-D(x)\right)=\mathbb{E}_{x \sim p_{\mathrm{data}}} \log \left(1-D(G_+(x))\right)
$ is a consequence of the Radon–Nikodym theorem. 
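
To spell out the first of these two statements (a short sketch, assuming the samples $x_i$ are drawn from $p_{\mathrm{data}}$ and $N$ is their number, as in Equation~\ref{eq_countergan_nondiff}), the definition of $p_{C_t}$ gives
\begin{equation*}
\mathbb{E}_{x \sim p_{C_t}} \log D(x)
= \int \mathcal N_t \, C_t(x)\, p_{\mathrm{data}}(x) \log D(x)\, \mathrm{d}x
= \frac{\mathbb{E}_{x\sim p_{\mathrm{data}}}\left[C_t(x)\log D(x)\right]}{\mathbb{E}_{x\sim p_{\mathrm{data}}}\left[C_t(x)\right]}
\approx \frac{\frac{1}{N}\sum_i C_t(x_i) \log D(x_i)}{\frac{1}{N}\sum_i C_t(x_i)},
\end{equation*}
where the last step replaces both expectations by sample averages. The factors $1/N$ cancel, which leaves the first term of Equation~\ref{eq_countergan_nondiff}.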

From the expression of the value function in Equation~\ref{eq_countergan_nondiff_expectations}, Proposition 1 of \cite{Goodfellow2014-wf} implies that for any generator $G$ the optimal discriminator is 
\begin{equation}
D^*(x) = \frac{p_{C_t}(x)}{p_{g_+}(x)+p_{C_t}(x)}.
\end{equation}
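
For reference, the argument behind this result can be sketched as follows. This restates the reasoning of \cite{Goodfellow2014-wf} in the present notation, and the shorthand $a$, $b$, $y$ is ours. Writing the value function as an integral, $\mathcal{V} = \int \left[ a \log y + b \log (1-y) \right] \mathrm{d}x$ with $a = p_{C_t}(x)$, $b = p_{g_+}(x)$ and $y = D(x)$, the integrand can be maximized pointwise in $y \in (0,1)$:
\begin{equation*}
\frac{\partial}{\partial y}\left[a \log y + b \log(1-y)\right] = \frac{a}{y} - \frac{b}{1-y} = 0 \quad \Longleftrightarrow \quad y = \frac{a}{a+b}.
\end{equation*}
The second derivative, $-a/y^2 - b/(1-y)^2$, is negative, so this stationary point is a maximum.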

The value function for an ideal discriminator thus reads:
\begin{equation}
    \mathcal{V}^*(G) = \mathcal{V}(D^*, G)= \mathbb E_{x\sim p_{C_t}} \log \frac{p_{C_t}(x)}{p_{g_+}(x)+p_{C_t}(x)}
     + \mathbb E_{x\sim p_{g_+}} \log \frac{p_{g_+}(x)}{p_{g_+}(x)+p_{C_t}(x)}.
\end{equation}

To find the distribution $p_{g_+}^*$ that minimizes $\mathcal{V}^*$ under the probability normalization constraint $\int p_{g_+}(x) \mathrm{d}x = 1$, we introduce a Lagrange multiplier $\mu$. We then compute the functional derivative of the resulting Lagrangian with respect to $p_{g_+}$, using the shorthand $p = p_{C_t}(x)$ and $q = p_{g_+}(x)$:
\begin{equation}
    \frac{\delta \mathcal{V}^*}{\delta q} = \frac{\partial}{\partial q}\left[p\log\left(\frac{p}{p+q}\right) + q\log\left(\frac{q}{p+q} \right) + \mu q\right] = \log\left(\frac{q}{p+q}\right) +\mu.
\end{equation}
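
The intermediate steps of this derivative, written out term by term as a check (same shorthand $p$, $q$, $\mu$), are
\begin{align*}
\frac{\partial}{\partial q}\left[p \log \frac{p}{p+q}\right] &= -\frac{p}{p+q}, \\
\frac{\partial}{\partial q}\left[q \log \frac{q}{p+q}\right] &= \log\frac{q}{p+q} + 1 - \frac{q}{p+q} = \log\frac{q}{p+q} + \frac{p}{p+q}, \\
\frac{\partial}{\partial q}\left[\mu q\right] &= \mu,
\end{align*}
so the two $p/(p+q)$ terms cancel in the sum.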

% \begin{equation}
% \begin{aligned}
%     \frac{\delta \mathcal{V}^*}{\delta q} & = \frac{\partial}{\partial q}\left[p\log\left(\frac{p}{p+q}\right) + q\log\left(\frac{q}{p+q} \right) + \mu q\right] \\
%     & = \log\left(\frac{q}{p+q}\right) +\mu.
% \end{aligned}
% \end{equation}

The optimum of $\mathcal{V}^*$ is attained for 
\begin{equation}
    \frac{\delta \mathcal{V}^*}{\delta p_{g_+}}(x) = 0 \quad \Longleftrightarrow \quad p_{g_+}^*(x) = \frac{p_{C_t}(x)}{\exp(\mu) - 1},
\end{equation}
from which the normalization constraint leads to
\begin{equation}
\int \frac{p_{C_t}(x)}{\exp(\mu)- 1}\mathrm d x=1 \quad \Longleftrightarrow \quad \exp(\mu)=2,
\end{equation}
such that 
\begin{equation}
p_{g_+}^*(x) = p_{C_t}(x)
\end{equation}
for all $x$. Hence $\mathcal V^*$ has a unique optimum\footnote{The optimum is a minimum here since $\mathcal V^*$ is a convex functional of $p_{g_+}$, as can be seen from the form of the second functional derivative $\frac{\delta^2 \mathcal{V}^*}{\delta p_{g_+}^2}(x) = \frac{p_{C_t}(x)}{p_{g_+}(x)\left(p_{g_+}(x)+p_{C_t}(x)\right)}$, which is positive wherever $p_{C_t}(x) > 0$.} that is reached when 
\begin{equation}
 p_{g_+}^* = p_{C_t}.
\end{equation}
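
The two equivalences above follow from elementary algebra, spelled out here as a sketch with the shorthand $p = p_{C_t}(x)$ and $q = p_{g_+}^*(x)$:
\begin{equation*}
\log\frac{q}{p+q} + \mu = 0
\;\Longleftrightarrow\; q = e^{-\mu}(p+q)
\;\Longleftrightarrow\; q\left(e^{\mu}-1\right) = p
\;\Longleftrightarrow\; q = \frac{p}{e^{\mu}-1}.
\end{equation*}
Since $\int p_{C_t}(x)\,\mathrm{d}x = 1$, the normalization constraint reduces to $1/(e^{\mu}-1) = 1$, i.e.\ $e^{\mu} = 2$. As a consistency check, substituting $p_{g_+} = p_{C_t}$ into $\mathcal{V}^*$ gives
\begin{equation*}
\mathcal{V}^*(G) = \mathbb{E}_{x\sim p_{C_t}} \log\tfrac{1}{2} + \mathbb{E}_{x\sim p_{g_+}} \log\tfrac{1}{2} = -\log 4,
\end{equation*}
which matches the global minimum of the original GAN value function in \cite{Goodfellow2014-wf}.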

The fact that $p_{g_+}$ converges to the optimum when using the alternating gradient updates follows from Proposition 2 in \cite{Goodfellow2014-wf}. \end{proof}



\section{Synthetic dataset example \label{sec:appendix:toy_example}}


\begin{figure*}[htb!]
    \small
     \centering
     \begin{subfigure}[t]{0.155\textwidth}
        %  \centering
  \includegraphics[width=1.0\columnwidth]{figures/toy_dataset.png} 
         \caption{Original distribution of data points.}
         \label{fig:toy_dataset_distr}
     \end{subfigure}
     \begin{subfigure}[t]{0.155\textwidth}
        %  \centering
  \includegraphics[width=1.0\columnwidth]{figures/toy_dataset_classifier.png} 
         \caption{Decision boundary of trained classifier.}
         \label{fig:toy_dataset_clf}
     \end{subfigure}
    %  \hfill
     \begin{subfigure}[t]{0.155\textwidth}
        %  \centering
  \includegraphics[width=1.0\columnwidth]{figures/toy_dataset_classifier_and_samples.png} 
         \caption{Data points for counterfactual search.}
         \label{fig:toy_dataset_samples}
     \end{subfigure}
     \begin{subfigure}[t]{0.155\textwidth}
        %  \centering
  \includegraphics[width=1.0\columnwidth]{figures/toy_dataset_vanilla.png} 
         \caption{Regularized gradient descent (RGD).}
         \label{fig:toy_dataset_vgd}
     \end{subfigure}
    %  \hfill
     \begin{subfigure}[t]{0.16\textwidth}
        %  \centering
\includegraphics[width=0.96\columnwidth]{figures/toy_dataset_regular_gan.png}
         \caption{Standard GAN.}
         \label{fig:toy_dataset_regular_gan}
     \end{subfigure}
    %  \hfill
     \begin{subfigure}[t]{0.16\textwidth}
        %  \centering
\includegraphics[width=0.96\columnwidth]{figures/toy_dataset_countergan.png}
         \caption{CounteRGAN.}
         \label{fig:toy_dataset_countergan}
     \end{subfigure}
        \caption{Comparison of three counterfactual search techniques on a synthetic binary-class dataset. All three achieve their objectives but produce significantly different counterfactuals.}
        \label{fig:toy_dataset}
\end{figure*}


Figure~\ref{fig:toy_dataset} provides an example of counterfactual search on a synthetic dataset meant to illustrate the challenges faced by counterfactual generation methods. The data points shown in (a) can be interpreted as the known populations of two different societies (red and blue). An ML classifier has been trained to predict which society a person belongs to based on their weight ($x$-axis) and height ($y$-axis). The solid white line in (b) represents the classifier's decision boundary: points falling within the red shaded region are classified as belonging to the red society, and points in the blue shaded region as belonging to the blue society. The five selected orange points in (c) represent persons from the red society for whom we seek counterfactuals. These counterfactuals should provide meaningful recourse on how to turn these persons into realistic-looking members of the blue society, as predicted by the classifier. The counterfactuals generated by an existing method (d) produce the correct classification result (blue), but the suggested changes mean that the transformed individuals would not look like the rest of the known populace of the blue society (lack of realism). With a standard GAN (e), the counterfactuals always result in the same or similar-looking persons of the blue society. While these results are more realistic than those obtained with the previous method, the suggested changes may be harder to apply to some of the original persons than to others (i.e., lower sparsity) and are hence less actionable. The proposed CounteRGAN method (f) results in counterfactuals that have the desired classification (blue) and are more realistic and actionable than those obtained with the previous methods. Red society members seeking to imperceptibly infiltrate the blue society would benefit the most from the meaningful recourse provided by this method.

\section{Code}

The corresponding code to reproduce all the results and methods will be available by the date of publication.

\bibliographystyle{plainnat}
\bibliography{countergan}

\end{document}
