\section{RCA with a Known Graph}

In this section, we highlight the advantages of leveraging a DAG that is known prior to failure by contrasting our approach with RCD, an RCA method that relies solely on CI tests. We begin with the primary limitation of RCD: it cannot exploit a DAG, which significantly increases the number of CI tests it requires. We then show how graphical structure addresses this limitation in the case of a single root cause. For details on RCD, see Appendix~\ref{app:samplerun-rcd}. All proofs are provided in Appendix~\ref{app:proofs}.

First, RCD only learns the adjacencies between the F-NODE and each observed variable as it operates. For each pair of variables $X, Y \in \mathbf{V}$, it conditions on every possible subset $\mathbf{S} \subseteq \mathbf{V}$ until it finds a conditioning set that yields conditional independence, thereby excluding a node as a potential root cause under Assumption~\ref{assumption:faithfulness}. In contrast, under Assumption~\ref{assumption:causal Markov assumption}, access to a DAG $D$ lets us conduct only $n$ CI tests, \eg $(F\indep X|Pa_{D}(X))$ for each observed variable $X$, where $n$ is the number of observed variables. In other words, RCD performs at least as many CI tests as a naive approach that uses the DAG. Second, RCD may condition on a set of variables that is much larger than the actual parent set, which makes the CI test results unreliable in practice. Moreover, since our graphical structure captures ancestral relationships between nodes and there is only a single root cause, we argue that the root cause can be identified in fewer than $n$ tests. To support this, we present key results that allow a systematic exploration of the causal structure and significantly reduce the number of required CI tests.

For the case of a single root cause, the following two lemmas show that certain CI relations eliminate variables from consideration as root causes under Assumptions~\ref{assumption:causal Markov assumption} and~\ref{assumption:faithfulness}. The first lemma states that all ancestors of a variable $X$ can be excluded as the root cause if we observe that $F$ is conditionally independent of $X$ given some variables $\mathbf{Z}$. The second lemma asserts that all non-ancestors of $X$ can be excluded as the root cause if $F$ is conditionally dependent on $X$. Unlike RCD, which performs a series of CI tests and stops once a CI relation excludes a variable as the root cause, our approach uses these two results, Lemma~\ref{lem:ancestors_not_F} and Lemma~\ref{lem:descendants_cannot_be_targets}, to eliminate variables systematically without testing every variable.


\begin{figure}[t]
    \centering 
    \footnotesize
   \begin{tikzpicture} [scale=0.4] \label{fig:working-example-for-lemmas}
     \node (2) at (0,0) {$X_{1}$};
      \node (3) at (3,0) {$X_{2}$};
    \node (4) at (6,0) {$X_{3}$};
    \node (5) at (2,3) {$X_{4}$};
    \node (6) at (-1,3) {$F$};
      \node (7) at (-3,0) {$X_{5}$};
    \path (2) edge (3);
    \path (3) edge (4);
    \path (5) edge (3);
    \path (7) edge (2);
    \path[red, line width=1.0] (6) edge (3);
    \end{tikzpicture}
    \caption{An example showing how Lemmas~\ref{lem:ancestors_not_F} and~\ref{lem:descendants_cannot_be_targets} help identify the root cause with only a few invariance tests given a causal graph, where $X_{2}$ is the root cause.} \label{fig:lemma_illustration}
\end{figure}


% Lemmas~\ref{lem:ancestors_not_F} and~\ref{lem:descendants_cannot_be_targets} state that ancestors of variables independent of $F$ and descendants of those dependent on $F$ cannot be the root cause. Appendix~\ref{app:example-show-working-of-lemma4} illustrates how these insights can reduce the number of tests below $n$ with a known DAG.

\begin{restatable}{lemma}{notancestors}
\label{lem:ancestors_not_F}
Given a DAG $D$,
    if $(F \indep X)_{P}$ for some $X\in \V$, then $A\not \in Ch_{D_{aug}}(F)$ for all $A \in An_{D}(X)$, where $P$ is any joint distribution between variables on $D_{aug}$.
\end{restatable}

\begin{restatable}{lemma}{notdecendants}
\label{lem:descendants_cannot_be_targets}
   Given a DAG $D$, if $(F \not \indep X)_{P}$ for some $X \in \V$, then $Q \not \in Ch_{D_{aug}}(F)$ for all $Q \in NAn_{D}(X)$, where $P$ is any joint distribution between variables on $D_{aug}$.
\end{restatable}
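
As a minimal sketch of how the two lemmas combine, consider a hypothetical chain $X_{1} \rightarrow X_{2} \rightarrow \dots \rightarrow X_{7}$ with a single unknown root cause. This chain is an illustrative assumption and is not the graph of Figure~\ref{fig:lemma_illustration}. Testing the middle node $X_{4}$ splits the candidates as follows:
\[
\begin{array}{ll}
(F \indep X_{4})_{P}: & X_{1}, X_{2}, X_{3} \text{ are excluded by Lemma~\ref{lem:ancestors_not_F} and } X_{4} \text{ by the test itself, leaving } \{X_{5}, X_{6}, X_{7}\},\\
(F \not\indep X_{4})_{P}: & X_{5}, X_{6}, X_{7} \text{ are excluded by Lemma~\ref{lem:descendants_cannot_be_targets}, leaving } \{X_{1}, X_{2}, X_{3}, X_{4}\}.
\end{array}
\]
Repeating the same split on the remaining candidates isolates the root cause after at most $\lceil \log_{2} 7 \rceil = 3$ tests in this example, rather than one test per node.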

% \begin{restatable}{lemma}{notancestors}
% \label{lem:ancestors_not_F}
%    For a causal graph $D$ and let $P$ be any joint distribution between variables on $D$, consider a PDAG $G=(\V,\Eb)$ such that $D\subseteq G\subseteq \varepsilon(D)$.  
%     If $(F \indep X)_{P}$ for some $X\in \V$, then $A\not \in Ch_{D_{aug}}(F)$ for all $A \in An_{G}(X)$, where $D_{aug}$ is $D$ with a $F$ node pointing to the root cause in $D$ and $P$ be any joint distribution between variables on $D_{aug}$.
% \end{restatable}


% \begin{restatable}{lemma}{notdecendants}
% \label{lem:descendants_cannot_be_targets}
%    For a causal graph $D$, consider a PDAG $G=(\V, \Eb)$ such that $D\subseteq G\subseteq \varepsilon(D)$. If $(F \not \indep X)_{P}$ for some $X \in \V$, then then $Q \not \in Ch_{D_{aug}}(F)$ for all $Q \in NAn_{G}(X)$, where $D_{aug}$ is $D$ with a $F$ node pointing to the root cause in $D$ and $P$ be any joint distribution between variables on $D_{aug}$.
% \end{restatable}

% line graph: X1 -> X2 -> X3
We use Figure~\ref{fig:lemma_illustration} to illustrate how Lemmas~\ref{lem:ancestors_not_F} and~\ref{lem:descendants_cannot_be_targets} help identify the root cause, which is $X_{2}$ in this case, with fewer than $n$ invariance tests. We start by arbitrarily picking a variable and testing its independence from $F$. Suppose we select $X_{2}$ and test whether $(F\indep X_{2})_{P}$. By Assumption~\ref{assumption:faithfulness}, we observe $(F\not \indep X_{2})_{P}$, so Lemma~\ref{lem:descendants_cannot_be_targets} implies that $X_{3}$ cannot be the root cause. Suppose we next test $X_{1}$ and observe $(F\indep X_{1})_{P}$. Then $X_{1}$ is not the root cause and, by Lemma~\ref{lem:ancestors_not_F}, $X_{5}$ cannot be the root cause either. Only $X_{4}$ remains to be tested, and observing $(F\indep X_{4})_{P}$ leaves $X_{2}$ as the root cause. This results in a total of $3$ marginal independence tests, which is fewer than $n=5$. To further illustrate the utility of these two results, we show that there is a one-to-one correspondence between the use of marginal invariance tests for RCA with a known DAG and the problem known as \textit{Interactive Graph Search} (IGS)~\citep{tao2019interactive}, which guarantees identification of the root cause in fewer than $n$ tests. For clarity, we first state the formal problem formulation of IGS.

\begin{tcolorbox}[colback=gray!20, colframe=white, boxrule=0mm, boxsep=2mm, left=1mm, right=1mm, top=1mm, bottom=1mm]
\textbf{Interactive Graph Search (IGS)}\\
INSTANCE: A DAG $D=(\V,\Eb)$ with a single root node and a target node $R\in \V$ chosen arbitrarily by an adversary. An oracle answers the following query for any $X \in \V$: yes if there is a directed path from $X$ to $R$, and no otherwise.\\
QUESTION: \textit{What is the minimum number of queries to ask in order to identify $R$ in $D$?}
\end{tcolorbox}

\begin{restatable}
{lemma}{reduction}
\label{lem:reduction}   Consider a DAG $D=(\V, \mathbf{E})$ with a single sink node, and let $D'$ be the DAG obtained by reversing the direction of every edge in $\mathbf{E}$. Let $Q(X)$ be a query to the oracle on whether some $X \in \V$ has a directed path to an unknown target node $R \in \V$ in $D'$. Then
\begin{equation}
    Q(X) = \text{yes} \Leftrightarrow (F \not \indep X)_{P}.
\end{equation}
Therefore, if $Q(X) =$ yes, then $X \in An_{D'}(R)$, and if $Q(X)=$ no, then $X \in NAn_{D'}(R)$.
\end{restatable}
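
As a sanity check of this correspondence, consider the graph of Figure~\ref{fig:lemma_illustration}, reading the red edge as drawn, $F \rightarrow X_{2}$, so that $R = X_{2}$. This reading, and the convention that every node has a trivial directed path to itself, are assumptions of this sketch. Reversing the edges gives $D'$ with edges $X_{3} \rightarrow X_{2}$, $X_{2} \rightarrow X_{1}$, $X_{2} \rightarrow X_{4}$, and $X_{1} \rightarrow X_{5}$, whose single root is $X_{3}$. The oracle answers and the relations implied by Assumptions~\ref{assumption:causal Markov assumption} and~\ref{assumption:faithfulness} then line up as follows:
\[
\begin{array}{llll}
X & \text{directed path from } X \text{ to } R \text{ in } D' & Q(X) & \text{relation in } P \\
\hline
X_{3} & X_{3} \rightarrow X_{2} & \text{yes} & (F \not\indep X_{3})_{P} \\
X_{2} & \text{trivial} & \text{yes} & (F \not\indep X_{2})_{P} \\
X_{1} & \text{none} & \text{no} & (F \indep X_{1})_{P} \\
X_{4} & \text{none} & \text{no} & (F \indep X_{4})_{P} \\
X_{5} & \text{none} & \text{no} & (F \indep X_{5})_{P}
\end{array}
\]
In this example, the nodes with $Q(X)=$ yes are exactly the descendants of $X_{2}$ in $D$, including $X_{2}$ itself, which are the nodes dependent on $F$.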

The significance of Lemma~\ref{lem:reduction} is that a solution to IGS is also a solution to RCA using marginal invariance tests, given a known DAG. For DAGs that do not have a single sink node, we can simply add a dummy node as a child of all the sink nodes. Hence, the following theorem is an immediate consequence of Theorem~1 of~\citet{shangqi2023partial} (see Appendix~\ref{app:shang_theorem}).
\begin{restatable}
{theorem}{reductionoptimal}
 Given a DAG $D$ with a single sink node, any algorithm that only uses marginal invariance tests must perform $\Omega(\log_{2}n + d\log_{1+d}n)$ tests to find the single root cause in the worst case, where $d$ is the maximum in-degree of $D$ and $n$ is the number of nodes. Moreover, there exists an algorithm that finds the root cause with $\mathcal{O}(\log_{2}n + d\log_{1+d}n)$ marginal invariance tests.
\end{restatable}
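
To give a feel for the scale of this bound, the expression inside the asymptotic notation can be evaluated for hypothetical graph sizes. The values of $n$ and $d$ below are illustrative choices, not taken from any experiment, and the hidden constants are ignored, so the numbers indicate order of magnitude only:
\[
\begin{array}{ll}
n = 1024,\ d = 2: & \log_{2} n + d\log_{1+d} n = 10 + 2\log_{3} 1024 \approx 10 + 12.6 = 22.6, \\
n = 1024,\ d = 4: & \log_{2} n + d\log_{1+d} n = 10 + 4\log_{5} 1024 \approx 10 + 17.2 = 27.2.
\end{array}
\]
Under these assumptions, both values are far smaller than the $n = 1024$ tests needed when every variable is tested.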
% \begin{theorem}
%     Given a causal graph $D$ that has a single sink node,  any algorithm the solely uses CI tests must perform  $\Omega(\log_{2}n + d\log_{1+d}n)$ marginal invariance tests to identify the single root cause in the worst case, where $d$ is the maximum in-degree of $D$ and $n$ is the number of vertices. There exists an algorithm that localizes the root cause with $\mathcal{O}(\log_{2}n + d\log_{1+d}n)$ marginal invariance tests.
% \end{theorem}
% \begin{proof}
%     This follows from Lemma \ref{lem:reduction} and Theorem 1 in \citep{shangqi2023partial}, which says that any algorithm must ask $\Omega(\log_{2}n + d\log_{1+d}n)$ queries to identify the target node selected by an adversary in a DAG $D'$ with a single root node for the problem of IGS, where $d$ is the maximum out-degree in $D'$ and there is an algorithm that can find the target node in $\mathcal{O}(\log_{2}n + d\log_{1+d}n)$ number of queries.
% \end{proof}

\citet{shangqi2023partial} provide an optimal algorithm for IGS whose worst-case number of queries is $\mathcal{O}(\log_{2}n + d\log_{1+d}n)$. By Lemma~\ref{lem:reduction}, this algorithm can be adapted to RCA with a single root cause using marginal invariance tests. Hence, we have shown that fewer than $n$ tests are needed and that marginal invariance tests alone suffice to identify the root cause given a DAG. We provide the pseudocode of the modified IGS algorithm as Algorithm~\ref{alg:modified-dfs-tree} in the Appendix.

% consider the simple line graph $X_1 \rightarrow X_2 \rightarrow X_3$ and consider $X_1$ to be the root cause. By applying Lemma \ref{lem:descendants_cannot_be_targets}, we can identify $X_1$ as the root cause after testing $(F \indep X_{1})_{P}$. This is because $X_2$ and $X_3$ are descendants of $X_1$, and the lemma asserts that the descendants of a node that is dependent on \fnode cannot be potential targets. Consequently, we can eliminate $X_2$ and $X_3$ from the list of potential targets without running any additional CI tests for them. Similarly, consider the same line graph, but this time assume that $X_3$ is the root cause. According to Lemma \ref{lem:ancestors_not_F}, after testing $(F \indep X_{2})_{P}$, we can eliminate $X_1$ and $X_2$ from the list of potential targets. This is because the ancestors of a node that is conditionally independent of \fnode cannot be the root cause. These lemmas allow us to streamline the RCA process by reducing the number of necessary CI tests, thereby increasing efficiency in pinpointing the root cause.

% Using the two key results along with our specific graph structure, we can reduce the number of CI tests from $n$ to exactly 1. However, this reduction is only possible if we select the right node to test. To systematically identify the optimal nodes for testing, we propose a specialized binary search algorithm tailored for directed line graphs.. Given a line graph, \name selects the middle node $X$ and checks if $(F \indep X)_{P}$. If the test shows independence, then using Lemma~\ref{lem:ancestors_not_F}, \name can ignore all the ancestors of $X$ as they cannot be the target. Conversely, if the CI test indicates dependence between $F$ and $X$, then \name applies~\ref{lem:descendants_cannot_be_targets} and remove all descendants of $F$ from the list of potential root causes. This binary search approach for line graphs result in $\log n$ tests as half of the nodes are eliminated at every iteration. The correctness of this algorithm directly follows from Lemma~\ref{lem:ancestors_not_F} and Lemma~\ref{lem:descendants_cannot_be_targets}.


% \begin{algorithm}[H] 
% \small
%     \caption{Revised Heavy-path DFS Tree for RCA} \label{alg:known graph}
%     \begin{algorithmic}[1]
%         \INPUT Interventional data $\mathcal{D}$, a DAG $G=(\V, \Eb)$
%         \OUTPUT Top one root cause $R$
%         \STATE If $G$ has more than one root node, add a node $Q$ as a parent to all root nodes in $\V$.
%         \STATE $T \leftarrow$ \textbf{CONSTRUCT-HEAVY-PATH-DFS-TREE}($G$)
%         \STATE If $T$ has more than one sink node, add a node $S$ as a child to all sink nodes in $T$. 
%         \STATE $\hat{R} \leftarrow$ the root $Q$
%         \REPEAT
%         \STATE Perform a binary search on the leftmost $\hat{R}$-to-leaf path of $T$ to find the last node $R$ that gives $(F \dep R)_{P}$
%         \IF{$R$ does not have a parent or its parent is independent of $F$}
%             \STATE \textbf{return} $R$ as the root cause
%         \ENDIF
%         \UNTIL{Root cause $R$ has been found}
%     \end{algorithmic}
% \end{algorithm}



% Using the two key results along with our specific graph structure, we can reduce the number of CI tests from $n$ to exactly 1. However, this reduction is only possible if we select the right node to test. To devise a systematic approach for identifying the right nodes to test, we turn to a well-studied problem in graph search literature. Specifically, we refer to a problem known as Interactive Graph Search (IGS)~\cite{parameswaran2011human, tao2019interactive, lu2022optimal, shangqi2023partial}, where the objective is to find a target in a given DAG by probing a minimum number of nodes. A probe on a node reveals whether the target lies among its descendants. We argue that identifying the optimal set of nodes to test in our problem can be reduced to an instance of the IGS problem. By leveraging our key results, we establish a connection between IGS and our task of selecting the appropriate nodes to test in order to reduce the number of CI tests. This approach allows us to systematically explore the causal structure and optimize the RCA process by reducing unnecessary tests and focusing on the most informative nodes.

% The key idea that enables us to use IGS is that our two key results, Lemma \ref{lem:ancestors_not_F} and Lemma \ref{lem:descendants_cannot_be_targets}, provide a mechanism analogous to a probe in IGS. After performing a marginal CI test on a node $X$, we can accurately determine whether the root cause lies among the ancestors or non-ancestors of $X$. The only difference is that, in IGS, a positive probe indicates that the target is among the descendants of the probed node. In our case, however, a positive CI test (where the node is dependent on \fnode) indicates that the target is among the ancestors of that node. To reduce our problem of finding the root cause in a given DAG to an IGS problem, we simply reverse the direction of all edges in the DAG and apply IGS on this modified graph. By doing so, the IGS mechanism can help us systematically identify the optimal nodes to test, thereby reducing the number of CI tests needed to find the root cause.

% Lemma~\ref{lem:igs_correctness} and its proof in the appendix states that using IGS is sound for finding the root cause in a DAG.

% \begin{restatable}{lemma}{igscorrectness}
% \label{lem:igs_correctness}
%  For a causal DAG $D=(\V,\E)$, construct a DAG $D_{aug}$ with all the edges reversed in $D$. Given access to a perfect conditional independence oracle, and under causal sufficiency, and the faithfulness assumption, IGS on $D_{aug}$ will find the root causes.
% \end{restatable}

% Using IGS to identify the target in a DAG provides a systematic solution while significantly reducing the number of CI tests that need to be executed. Specifically, with \name, we previously performed up to $n$ CI tests. However, by applying IGS, we can reduce this number to $(d - 1) \log_{d} n$, where $d$ is the maximum out-degree of a node in the DAG.
