\section{Equivalence Classes and Their Implications}
\label{sec:pag_notation}

A common graphical abstraction for representing a set of causal diagrams that share the same $d$-separation and non-ancestral relations is the so-called Maximal Ancestral Graph (MAG). A MAG is ``ancestral'' because it contains neither directed cycles (directed paths that start and end at the same node) nor almost directed cycles (directed paths $X \rightarrow \cdots \rightarrow Y$ such that $X \dashleftarrow\dashrightarrow Y$), and ``maximal'' because, for every pair of nonadjacent nodes $\{X, Y\}$, there exists a set $\Z\subset \V$ that $d$-separates them.

Two causal diagrams or MAGs are said to be Markov equivalent (ME) if they entail the same set of $d$-separations\footnote{The notion corresponding to $d$-separation in MAGs is called $m$-separation.}. An ME class of graphs can be summarized in a Partial Ancestral Graph (PAG), which includes one additional edge tip ``$\circ$'' that denotes undecidability, \textit{i.e.}, some graphs in the equivalence class have an arrowhead at that position while others have a tail \citep{zhang2006causal,zhang2008causal}\footnote{Selection bias, typically represented with undirected edges \citep{zhang2008completeness} or extra variables, is not considered in this paper.}. Directed edges $X \rightarrow Y$ in a MAG or PAG are said to be visible, denoted $X \xrightarrow{v} Y$, if unobserved confounding can be ruled out. For example, the PAG in \Cref{fig:examples:b} encodes the ME class of the causal diagram in \Cref{fig:examples:a}. The output of the FCI algorithm is a PAG that can be recovered consistently under faithfulness \citep{spirtes2000causation,zhang2006causal,zhang2008causal}.

\textbf{Notation.} It will be useful to use standard graph-theoretic family abbreviations to represent graphical relationships in causal diagrams or equivalence classes. A path between $X$ and $Y$ is potentially directed (causal) from $X$ to $Y$ if there is no arrowhead on the path pointing towards $X$. $Y$ is called a possible descendant of $X$, \textit{i.e.}, $Y \in \texttt{PossDe}(X)$, and $X$ a possible ancestor of $Y$, \textit{i.e.}, $X \in \texttt{PossAn}(Y)$, if there is a potentially directed path from $X$ to $Y$. By stipulation, $X \in \texttt{PossAn}(X)$. A set $\X$ is ancestral if no node outside $\X$ is a possible ancestor of any node in $\X$. $X$ is called a possible parent of $Y$, \textit{i.e.}, $X \in \texttt{PossPa}(Y)$, and $Y$ a possible child of $X$, \textit{i.e.}, $Y \in \texttt{PossCh}(X)$, if they are adjacent and the edge is not into $X$. Further, $X$ is called a possible spouse of $Y$, \textit{i.e.}, $X \in \texttt{PossSp}(Y)$, if they are adjacent and the edge is not visible. For a set of nodes $\X$, we have $\texttt{PossPa}(\X) = \bigcup_{X\in\X} \texttt{PossPa}(X)$. %Given two sets of nodes $\X$ and $\Y$, a path between them is called proper if one of the endpoints is in $\X$ and the other is in $\Y$, and no other node on the path is in $\X$ or $\Y$. 
If the edge marks on a path between $X$ and $Y$ are all circles, we call the path a circle path. We refer to the closure of nodes connected with circle paths as a bucket. For example, in \Cref{fig:examples:b} $\{C,D\}$ is a bucket.
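
As an illustration of this notation, consider a hypothetical PAG (an assumption made for this example, not one of the figures in this paper) over $\{V_1, V_2, V_3, V_4\}$ with edges $V_1 \circ\!\!\rightarrow V_3$, $V_2 \circ\!\!\rightarrow V_3$, and $V_3 \xrightarrow{v} V_4$. Applying the definitions above edge by edge would give
\begin{align*}
    \texttt{PossDe}(V_1) &= \{V_1, V_3, V_4\}, & \texttt{PossAn}(V_4) &= \{V_1, V_2, V_3, V_4\},\\
    \texttt{PossPa}(V_3) &= \{V_1, V_2\}, & \texttt{PossCh}(V_3) &= \{V_4\},\\
    \texttt{PossSp}(V_3) &= \{V_1, V_2\}, & \texttt{PossSp}(V_4) &= \emptyset.
\end{align*}
Here $V_2 \notin \texttt{PossDe}(V_1)$ because the only path from $V_1$ to $V_2$ carries an arrowhead at $V_3$ that points back towards $V_1$, and $V_3 \notin \texttt{PossSp}(V_4)$ because the edge between them is visible. The set $\{V_1, V_2, V_3\}$ is ancestral, whereas $\{V_3, V_4\}$ is not. Since every edge carries at least one arrowhead, there are no circle paths and each bucket is a single node.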


%\subsection{Causal reasoning with PAGs}
%\label{sec:identification}
%In this section, we review the techniques developed in \citep{jaber2019causal} for \emph{point-identifying} (conditional) causal effects. We start by defining a decomposition of variables $\V$ into so called $c$-components that forms the basis for systematic identification algorithms \cite[Lemma 11]{tian2002general}. In a causal diagram, two nodes are said to be in the same $c$-component $\C \subseteq \V$ if and only if they are connected by a bi-directed path, \textit{i.e.} composed entirely of edges of the type ``$\dashleftarrow\dashrightarrow$''. For any set $\C \subseteq \V$, $Q[\C]:= P_{\v\backslash\c}(\c)$ denotes the post-interventional distribution of $\C$ under an intervention on $\V\backslash\C$. By definition $Q[\V] = P(\v)$ and by convention $Q[\emptyset] = 1$. 

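Recall that, in a causal diagram, two nodes are in the same $c$-component if and only if they are connected by a bi-directed path, \textit{i.e.}, a path composed entirely of edges of the type ``$\dashleftarrow\dashrightarrow$''. The notion of $pc$-component, defined below, generalizes that of $c$-component to equivalence classes and will be important for the proposed approach.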

\begin{definition}[$pc$-component \citep{jaber2018causal}]
    In a PAG, or any induced sub-graph thereof, two nodes are in the same possible $c$-component ($pc$-component) if there is a path between them such that all non-endpoint nodes along the path are colliders, and none of the edges is visible.
\end{definition}

In words, the $pc$-component of a set $\A$ includes all the nodes which could, in some causal diagram, be in the $c$-component of some node in $\A$. Following this definition, \emph{e.g.}, $A,B$ and $X$ in \Cref{fig:examples:b} are in the same $pc$-component since $X$ is a collider on the path between them and none of the edges are visible. By contrast, $A$ and $C$ are not in the same $pc$-component since there is a visible edge on all paths that connect them. %In particular, it holds that $X$ and $Y$ are in the same $c$-component in a Markov equivalent causal diagram $\G$, then $X$ and $Y$ are in the same $pc$-component in its PAG. 
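Using these notions, a causal effect of the form $Q[\C]$, where $Q[\C] := P_{\v\backslash\c}(\c)$ denotes the post-interventional distribution of $\C$ under an intervention on $\V\backslash\C$, can be decomposed into a product of smaller quantities, as shown in \Cref{prop:decomposition} using the notion of a region (\Cref{def:region}).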

\begin{definition}[Region \citep{jaber2019causal}]
    \label{def:region}
    Given a PAG $\1P$ over $\V$ and sets $\A \subseteq \C \subseteq \V$, the region of $\A$ with respect to $\C$ is the union of the buckets that contain nodes in the $pc$-component of $\A$ in the sub-graph $\1P_\C$.
\end{definition}

\begin{proposition}[Thm. 1 \citep{jaber2019causal}]
    \label{prop:decomposition}
    Given a PAG $\1P$ over $\V$ and $\A\subset\C\subseteq\V$, let the region of $\A$ with respect to $\C$ be denoted $\1R_\A$. Then $Q[\C]$ can be decomposed as
    \begin{align}
        Q[\C] = Q[\1R_\A] \cdot Q[\1R_{\C\backslash\A}] \hspace{0.1cm}/\hspace{0.1cm} Q[\1R_\A\inter\1R_{\C\backslash\A}].
    \end{align}
\end{proposition}
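
To see the decomposition at work, consider a hypothetical PAG $\1P$ (an assumption made for illustration, not a figure from this paper) over $\V = \{V_1, V_2, V_3, V_4\}$ with edges $V_1 \circ\!\!\rightarrow V_3 \leftarrow\!\!\circ V_2$ and $V_3 \xrightarrow{v} V_4$. There are no circle paths, so every bucket is a single node. Take $\C = \V$ and $\A = \{V_4\}$. The only edge incident to $V_4$ is visible, so the $pc$-component of $\A$ is $\{V_4\}$ and $\1R_\A = \{V_4\}$. For $\C\backslash\A = \{V_1, V_2, V_3\}$, the three nodes are joined by non-visible edges with $V_3$ as a collider, while the visible edge again excludes $V_4$, so $\1R_{\C\backslash\A} = \{V_1, V_2, V_3\}$. The two regions are disjoint and, with the usual convention $Q[\emptyset] = 1$, \Cref{prop:decomposition} would give
\begin{align*}
    Q[\V] = \frac{Q[\{V_4\}] \cdot Q[\{V_1, V_2, V_3\}]}{Q[\emptyset]} = Q[\{V_4\}] \cdot Q[\{V_1, V_2, V_3\}].
\end{align*}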

Identification of quantities $Q[\cdot]$ given an equivalence class $\1P$ relies on a (partial) topological order over the nodes in $\1P$. A partial topological order is defined over buckets rather than individual nodes. It thereby extends the notion used for single causal diagrams, and it is valid for all causal diagrams in the Markov equivalence class \citep[Lemma 1]{jaber2018causal}. For example, $A \prec B \prec X \prec \{C,D\}$ is a partial topological order over the buckets of $\1P$ in \Cref{fig:examples:b}.

Conditional causal effects of the form $P_\x(\y \mid \z)$ can similarly be decomposed in terms of $Q[\cdot]$ using the definition of conditional probability,
\begin{align}
    P_\x(\y \mid \z) = \sum_{\c\backslash\y}\left(Q[\C\union\Z] \hspace{0.1cm}/\hspace{0.1cm} \sum_{\c}Q[\C\union\Z]\right),
\end{align}
where $\C=\texttt{PossAn}(\Y\union\Z)_{\1P_{\V\backslash \X}}\backslash \X$. %Complete identification algorithms for both unconditional and conditional causal effects from a PAG exist based on the decomposition in \Cref{prop:decomposition} \citep{jaber2019causal, jaber2022causal}. 
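
The appeal to the definition of conditional probability can be sketched as follows. This is a generic expansion, assuming $P_\x(\z) > 0$, rather than a restatement of the derivation in the cited works:
\begin{align*}
    P_\x(\y \mid \z) = \frac{P_\x(\y, \z)}{P_\x(\z)} = \frac{P_\x(\y, \z)}{\sum_{\y} P_\x(\y, \z)}.
\end{align*}
Both the numerator and the denominator are unconditional interventional quantities, which is what allows each of them to be expressed through $Q[\cdot]$.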

The decomposition of causal effects according to $pc$-components, together with partial topological orders, plays a critical role in systematic identification algorithms and will be important in our work.

