\section{Bayesian Testing Procedure} \label{sec:bayesiantest}

We start by proposing a Bayesian test for the IV inequalities using finite data. \citet{Ramsahai08} proposed a frequentist test and derived the distribution of the likelihood ratio, which yields a p-value when the treatment is binary, but it is unclear how that test extends to our case, where the ``treatment'' is not binary. In addition, \citet{WangRR17} provide a frequentist test that involves multiple one-sided independence tests. In contrast, the Bayesian test extends directly to other hypotheses that are expressed in terms of the observational distribution, including bounds on unidentifiable causal queries.

Consider discrete random variables $X,Y,Z$ where $|\cX|=n, |\cY|=2, |\cZ|=2$. The observational distribution $P(X,Y,Z)$ lies in the $(4n-1)$-dimensional simplex, denoted by $\Delta$, which is a subset of $\RR^{4n}$. We consider a Bayesian model selection procedure where
\begin{align*}
    \MM_0 &= \left \lbrace \theta \in \Delta : \theta \text{ satisfies the IV inequalities} \right \rbrace, \\
    \MM_1 &= \left \lbrace \theta \in \Delta : \theta \text{ does not satisfy the IV inequalities} \right \rbrace.
\end{align*}
Note that $\MM_0 \,\dot{\cup}\, \MM_1 = \Delta$. Given a finite dataset of $(X,Y,Z)$ tuples, denoted by $R_1, R_2, \cdots, R_m$, and choices of prior distributions for the models, $\pi\Paren{\theta\mid\MM_0}$ and $\pi\Paren{\theta\mid\MM_1}$, we report a confidence interval for the posterior probability that the IV inequalities are satisfied, i.e., $P\Paren{\MM_0\mid R_1,R_2, \cdots, R_m} = \int_{\theta \in \MM_0} P\Paren{\theta \mid R_1,R_2, \cdots, R_m} \, d\theta$. Given the posterior density $P\Paren{\theta \mid R_1,R_2, \cdots, R_m}$, we estimate the posterior probability of $\MM_0$ by drawing $n$ IID samples from the posterior (with a slight abuse of notation, $n$ here denotes the number of posterior samples rather than $|\cX|$) and counting the number of samples, denoted by $N$, that satisfy the IV inequalities. Since $N$ is a binomial random variable with parameters $n$ and $P\Paren{\MM_0\mid R_1,R_2, \cdots ,R_m}$, a confidence interval on $P\Paren{\MM_0 \mid R_1,R_2, \cdots, R_m}$ is readily obtained by the Clopper-Pearson method \citep{ClopperPearson34}.
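
The sampling procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the IV inequalities take the form of Pearl's instrumental inequality, $\max_x \sum_y \max_z P(x,y\mid z) \leq 1$, and that the posterior over $\Delta$ is a Dirichlet distribution whose parameters are the cell counts plus one (a flat prior over all of $\Delta$). The function names, the flat index layout, and the example counts are hypothetical.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalizing Gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [v / total for v in g]


def satisfies_iv(theta, n_x):
    """Check Pearl's instrumental inequality (ASSUMED form of the IV inequalities).

    theta is a flat list with theta[(x * 2 + y) * 2 + z] = P(x, y, z).
    """
    p_z = [sum(theta[(x * 2 + y) * 2 + z] for x in range(n_x) for y in range(2))
           for z in range(2)]
    for x in range(n_x):
        total = sum(max(theta[(x * 2 + y) * 2 + z] / p_z[z] for z in range(2))
                    for y in range(2))
        if total > 1.0 + 1e-12:  # small tolerance for rounding
            return False
    return True


def _log_pmf(i, n, p):
    """Log of the Binomial(n, p) probability mass at i."""
    if p <= 0.0:
        return 0.0 if i == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if i == n else -math.inf
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p))


def _tail(a, b, n, p):
    """P(a <= Bin(n, p) <= b)."""
    return sum(math.exp(_log_pmf(i, n, p)) for i in range(a, b + 1))


def _bisect(increasing_f, target, iters=100):
    """Solve increasing_f(p) = target for p in [0, 1] by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if increasing_f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(k, n, level=0.95):
    """Two-sided Clopper-Pearson interval for k successes in n trials."""
    half = (1.0 - level) / 2.0
    lower = 0.0 if k == 0 else _bisect(lambda p: _tail(k, n, n, p), half)
    upper = 1.0 if k == n else _bisect(lambda p: -_tail(0, k, n, p), -half)
    return lower, upper


def posterior_prob_iv(counts, n_x, n_samples=2000, level=0.95, seed=0):
    """Count posterior samples satisfying the inequality; return (N, interval)."""
    rng = random.Random(seed)
    alpha = [c + 1.0 for c in counts]  # ASSUMPTION: flat Dirichlet prior on Delta
    hits = sum(satisfies_iv(sample_dirichlet(alpha, rng), n_x)
               for _ in range(n_samples))
    return hits, clopper_pearson(hits, n_samples, level)


# Hypothetical counts (not the Berkeley data): 2 values of X, 100 per cell.
hits, (lo, hi) = posterior_prob_iv([100] * 8, n_x=2, n_samples=500)
```

The interval is computed by bisection on the binomial tail probabilities so that the sketch depends only on the standard library; a beta-quantile routine from a statistics package would give the same endpoints.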

\noindent\textbf{Results on Berkeley admission data: }We use the \texttt{UCBAdmissions} \citep{UCBAdmissions} dataset from \texttt{R}, which contains counts for each sex-department-admissions outcome tuple for the $6$ largest departments. Therefore, $|\cX| = 6$, $|\cY|=2, |\cZ|=2$. Since the data satisfy positivity for sex, we can use the Bayesian model selection procedure. For parameters \ifdefined\SINGLE $$\theta = \Paren{ P(d,a,s): s \in \cX_S, d \in \cX_D, a \in \cX_A},$$ \else $\theta = \Paren{ P(d,a,s): s \in \cX_S, d \in \cX_D, a \in \cX_A}$,\fi we choose a flat Dirichlet prior over $\Delta$, giving us $\pi(\theta \mid \MM_i)= c_i \text{Dir}\Paren{1,1,\cdots,1}\bm{1}\left[ \theta \in \MM_i \right]$, where $c_i$ is a normalizing constant. The counts from the data are used to obtain the posterior, $P(\theta \mid R_1, R_2, \cdots, R_m)$, which is also a truncated Dirichlet distribution. Using $n=10^6$ posterior samples, we observe no violations of the IV inequalities. Therefore, the confidence interval for the posterior probability of the Berkeley data satisfying the IV inequalities is $\left[1-3.69\times 10^{-6},1\right]$. As mentioned in Section~\ref{subsec:graph_iv}, satisfying the IV inequalities implies that fairness is undecidable. In Section~\ref{app:bayes}, we carry out a sensitivity analysis by varying the chosen prior. We also report results on a different dataset from \citet{Bol23} that investigates sex-based discrimination in awarding cum-laude distinctions to graduate students.
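
As a check on the reported interval, the lower Clopper-Pearson endpoint has a closed form when every sample satisfies the inequalities ($N = n$). The calculation below assumes a two-sided interval at level $1-\alpha$ with $\alpha = 0.05$; the level is not stated above, but this choice reproduces the reported endpoint. The lower endpoint $p_L$ solves
\begin{align*}
    P\Paren{\mathrm{Bin}(n, p_L) \geq n} = p_L^{\,n} = \frac{\alpha}{2}
    \quad\Longrightarrow\quad
    p_L = \Paren{\frac{\alpha}{2}}^{1/n} = \exp\Paren{\frac{\ln(\alpha/2)}{n}} \approx 1 + \frac{\ln(\alpha/2)}{n}.
\end{align*}
With $n = 10^6$ and $\ln(0.025) \approx -3.689$, this gives $p_L \approx 1 - 3.69 \times 10^{-6}$, and the upper endpoint equals $1$ because $N = n$.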

In addition to the Bayesian test, we find that the maximum likelihood (ML) estimate satisfies the IV inequalities, which implies that a likelihood ratio test does not have enough evidence to reject the null hypothesis. An implementation of the test of \citet{WangRR17} on the Berkeley dataset also does not reject the null hypothesis (see Section~\ref{app:bayes} for details). The code can be found at {\small\texttt{https://github.com/SourbhBh/BerkeleyCode}}.

%However, a procedure to compute the p-value in case the ML estimator does not satisfy the IV inequalities is unknown \citep{Ramsahai08}. 
