
\doublespacing

\renewcommand{\thefigure}{\arabic{figure}} % Plain arabic figure numbers (no "S" prefix)
\renewcommand{\thetable}{S\arabic{table}} % Add "S" prefix
\setcounter{lemma}{0}
% \setlength{\oddsidemargin}{0in}  %left margin position, reference is one inch
% \setlength{\textwidth}{6.5in}    %width of text=8.5-1in-1in for margin
% \setlength{\topmargin}{-0.5in}    %reference is at 1.5in, -.5in gives a start of about 1in from top
% \setlength{\textheight}{9in}     %length of text=11in-1in-1in (top and bot. marg.) 

\definecolor{background-color}{gray}{0.98}
\definecolor{backcolour}{rgb}{0.95,0.95,0.92}
\definecolor{codegreen}{rgb}{0,0.6,0}

% Define a custom style
\lstdefinestyle{myStyle}{
    backgroundcolor=\color{backcolour},   
    commentstyle=\color{codegreen},
    basicstyle=\ttfamily\footnotesize,
    breakatwhitespace=false,         
    breaklines=true,                 
    keepspaces=true,                 
    numbers=left,       
    numbersep=5pt,                  
    showspaces=false,                
    showstringspaces=false,
    showtabs=false,                  
    tabsize=2,
}

% Use \lstset to make myStyle the global default
\lstset{style=myStyle}


\newcommand\circleast{\mathrel{{\circ}\!{-}\!{\ast}}}
\newcommand\leftarrowast{\mathrel{{\leftarrow}\!{\ast}}}
\newcommand\rightarrowast{\mathrel{{\ast}\!{\rightarrow}}}
\newcommand\rightarrowcircle{\mathrel{{\circ}\!{\rightarrow}}}
\newcommand\leftarrowcircle{\mathrel{{\leftarrow}\!{\circ}}}


\newpage

\onecolumn

\title{Relational Causal Discovery with Latent Confounders\\(Supplementary Material)}
\maketitle


\begin{appendix}
\section{Background}
\subsection{Relational Data}\label{rcd}
In this subsection, we provide an example of relational data. Figure \ref{fig:schema} shows an example relational schema with two entity types, USER (U) and POST (P), and the relationship type between them, REACTS (R), with a MANY TO MANY cardinality, meaning that a user can react to multiple posts and a post can receive reactions from multiple users. The USER entity type has three attributes: Type, Sentiment, and Activity, while the POST entity type has the attributes Content and Engagement. The relationship type REACTS has the attribute Frequency. 
\begin{figure}[ht]
    \centering
    \begin{tikzpicture}
        % Entities
        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm] (A) at (0, 0) {};
        \node[above] at (A.north) {USER};
    
        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm] (B) at (5, 0) {};
        \node[above] at (B.north) {POST};
    
                % Define the four corners of the rhombus
                \coordinate (N) at (2.5, 0.75);
                \coordinate (E) at (3.55, 0);
                \coordinate (S) at (2.5, -0.75);
                \coordinate (W) at (1.45, 0);
                \node[above] at (N.north) {REACTS};
                \node[above, font=\tiny] at ([xshift=-0.16cm] W.center) {MANY};
                \node[above, font=\tiny] at ([xshift=0.16cm] E.center) {MANY};
            
                % Create attributes inside A
                \node[draw, ellipse, thick, font=\scriptsize] (A1) at ([yshift=-0.66cm, xshift=-0.09cm] A.center) {Sentiment};
                \node[draw, ellipse, thick, font=\scriptsize] (A2) at ([xshift=-0.45cm, yshift=0.65cm] A.center) {Type};
                \node[draw, ellipse, thick, font=\scriptsize] (A3) at ([xshift=0.24cm, yshift=0cm] A.center) {Activity};
            
                % Create attributes inside AB1
                \node[draw, ellipse, font=\scriptsize, thick] (AB1_1) at ([yshift=-0.75cm] N.center) {Frequency};
                
                % Draw the lines to form the rhombus
                \draw (N) -- (E) -- (S) -- (W) -- cycle;
            
                \node[draw, ellipse, thick, font=\tiny]  (B1) at ([xshift=0cm, yshift=-0.45cm] B.center) {Engagement};
                \node[draw, ellipse, thick, font=\scriptsize] (B2) at ([xshift=0.1cm, yshift=0.5cm] B.center) {Content};
              
                % Lines
                \draw (A) -- (W);
                \draw (E) -- (B);
    
        % Curved edge
        % \path (A1) edge[bend right, thick, ->] (B1);
        % \path (B2) edge[bend right, thick, ->] (B1);
        % \path (AB1_1) edge[bend left, thick, ->] (A3);
        % \path (AB1_1) edge[bend right, thick, ->] (B1);
        % \path (A2) edge[bend right, thick, ->] (A1);
        % \path (A2) edge[bend left, thick, ->] (A3);
    \end{tikzpicture}
    \caption{Example of Relational Schema}
    \label{fig:schema}
\end{figure}

An instantiation of this relational schema, called a relational skeleton, is shown in Figure \ref{fig:skeleton-SI}. For simplicity, attribute values are left as placeholders for each entity and relationship instance. The skeleton contains three instances of the USER entity type (Bob, Anna, and Andrea) and four instances of the POST entity type (Food Recipe, Meme, Poem, and News). Bob and Anna react to Food Recipe and Meme, while Andrea reacts to Poem and News. Note that this skeleton is consistent with the cardinality constraints (i.e., MANY TO MANY) of the relationship defined in the schema.

\begin{figure}[ht]
    \centering
    \begin{tikzpicture}
        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm, thick] (E1) at (0, 0) {};
        \node[above] at (E1.north) {Bob};

        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm, thick] (E2) at (5, 0) {};
        \node[above] at (E2.north) {Anna};
    
        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm, thick] (P1) at (0, -5) {};
        \node[above] at (P1.north) {Food Recipe};

        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm, thick] (P2) at (5, -5) {};
        \node[above] at (P2.north) {Meme};

        \coordinate (N1) at (2.5, -1.5);
        \coordinate (ES1) at (3.55, -2.25);
        \coordinate (S1) at (2.5, -3);
        \coordinate (W1) at (1.45, -2.25);
        \node[above] at (N1.north) {Reacts};
        % \node[above, font=\tiny] at ([xshift=-0.16cm] W.center) {MANY};
        % \node[above, font=\tiny] at ([xshift=0.16cm] E.center) {MANY};
    
        \node[draw, ellipse, thick, font=\scriptsize] (A1) at ([yshift=-0.66cm, xshift=-0.09cm] E1.center) {Sentiment};
        \node[draw, ellipse, thick, font=\scriptsize] (A2) at ([xshift=-0.45cm, yshift=0.65cm] E1.center) {Type};
        \node[draw, ellipse, thick, font=\scriptsize] (A3) at ([xshift=0.24cm, yshift=0cm] E1.center) {Activity};

        \node[draw, ellipse, thick, font=\scriptsize] (A1) at ([yshift=-0.66cm, xshift=-0.09cm] E2.center) {Sentiment};
        \node[draw, ellipse, thick, font=\scriptsize] (A2) at ([xshift=-0.45cm, yshift=0.65cm] E2.center) {Type};
        \node[draw, ellipse, thick, font=\scriptsize] (A3) at ([xshift=0.24cm, yshift=0cm] E2.center) {Activity};
    
        \node[draw, ellipse, font=\scriptsize, thick] (loc_1) at ([yshift=-0.75cm] N1.center) {Frequency};
            
        \node[draw, ellipse, thick, font=\tiny]  (B1) at ([xshift=0cm, yshift=-0.45cm] P1.center) {Engagement};
        \node[draw, ellipse, thick, font=\scriptsize] (B2) at ([xshift=0.1cm, yshift=0.5cm] P1.center) {Content};

        \node[draw, ellipse, thick, font=\tiny]  (B1) at ([xshift=0cm, yshift=-0.45cm] P2.center) {Engagement};
        \node[draw, ellipse, thick, font=\scriptsize] (B2) at ([xshift=0.1cm, yshift=0.5cm] P2.center) {Content};

        \draw[thick] (N1) -- (ES1) -- (S1) -- (W1) -- cycle;

        \draw (W1) -- (E1);
        \draw (ES1) -- (E2);
        \draw (W1) -- (P1);
        \draw (ES1) -- (P2);

        %%%%%%%%%%%%%%%%%%%%%
        %%%%%%%%%%%%%%%%%%%%%
        %%%%%%%%%%%%%%%%%%%%%

        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm, thick] (E3) at (12.5, 0) {};
        \node[above] at (E3.north) {Andrea};
    
        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm, thick] (P3) at (10, -5) {};
        \node[above] at (P3.north west) {Poem};

        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm, thick] (P4) at (15, -5) {};
        \node[above] at (P4.north east) {News};

        \coordinate (N2) at (10, -1.75);
        \coordinate (ES2) at (11, -2.5);
        \coordinate (S2) at (10, -3.25);
        \coordinate (W2) at (9, -2.5);
        \node[above] at (N2.north) {Reacts};

        \coordinate (N3) at (15, -1.75);
        \coordinate (ES3) at (16, -2.5);
        \coordinate (S3) at (15, -3.25);
        \coordinate (W3) at (14, -2.5);
        \node[above] at (N3.north) {Reacts};
        % \node[above, font=\tiny] at ([xshift=-0.16cm] W.center) {MANY};
        % \node[above, font=\tiny] at ([xshift=0.16cm] E.center) {MANY};
    
        \node[draw, ellipse, thick, font=\scriptsize] (A1) at ([yshift=-0.66cm, xshift=-0.09cm] E3.center) {Sentiment};
        \node[draw, ellipse, thick, font=\scriptsize] (A2) at ([xshift=-0.45cm, yshift=0.65cm] E3.center) {Type};
        \node[draw, ellipse, thick, font=\scriptsize] (A3) at ([xshift=0.24cm, yshift=0cm] E3.center) {Activity};
    
        \node[draw, ellipse, font=\scriptsize, thick] (loc_2) at ([yshift=-0.75cm] N2.center) {Frequency};

        \node[draw, ellipse, font=\scriptsize, thick] (loc_3) at ([yshift=-0.75cm] N3.center) {Frequency};
            
        \node[draw, ellipse, thick, font=\tiny]  (B1) at ([xshift=0cm, yshift=-0.45cm] P3.center) {Engagement};
        \node[draw, ellipse, thick, font=\scriptsize] (B2) at ([xshift=0.1cm, yshift=0.5cm] P3.center) {Content};

          \node[draw, ellipse, thick, font=\tiny]  (B1) at ([xshift=0cm, yshift=-0.45cm] P4.center) {Engagement};
        \node[draw, ellipse, thick, font=\scriptsize] (B2) at ([xshift=0.1cm, yshift=0.5cm] P4.center) {Content};

        \draw[thick] (N2) -- (ES2) -- (S2) -- (W2) -- cycle;
        \draw[thick] (N3) -- (ES3) -- (S3) -- (W3) -- cycle;

        \draw (ES2) -- (E3);
        \draw (W3) -- (E3);
        \draw (S2) -- (P3);
        \draw (S3) -- (P4);

    \end{tikzpicture}
    \caption{Example of Relational Skeleton}
    \label{fig:skeleton-SI}
\end{figure}

Given this relational skeleton and the relational dependencies of the relational causal model in Figure \ref{fig:model}, we can obtain the corresponding ground graph, shown in Figure \ref{fig:groudgraph-SI}. The nodes of the ground graph represent the attributes of every entity and relationship instance in the skeleton, while the edges represent the dependencies of the relational causal model applied to those attribute instances. For example, the relational dependency $[P, R, U].Sentiment \rightarrow [P].Engagement$, which states that a post's engagement depends on the sentiment of the users who react to that post, is represented in the ground graph by the following edges: Bob.Sentiment $\rightarrow$ Food\_Recipe.Engagement, Bob.Sentiment $\rightarrow$ Meme.Engagement, Anna.Sentiment $\rightarrow$ Food\_Recipe.Engagement, Anna.Sentiment $\rightarrow$ Meme.Engagement, Andrea.Sentiment $\rightarrow$ Poem.Engagement, Andrea.Sentiment $\rightarrow$ News.Engagement. 
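
As a sketch of how these edges arise, each ground-graph edge links the $Sentiment$ attribute of a user reachable along $[P, R, U]$ to the $Engagement$ attribute of the post where the path starts. Writing $[P, R, U]|_{p}$ for the set of users reached from post $p$ (a notation assumed here for illustration), the skeleton described above gives:
\begin{align*}
    [P, R, U]|_{\text{Food\_Recipe}} &= \{\text{Bob}, \text{Anna}\}, & [P, R, U]|_{\text{Meme}} &= \{\text{Bob}, \text{Anna}\}, \\
    [P, R, U]|_{\text{Poem}} &= \{\text{Andrea}\}, & [P, R, U]|_{\text{News}} &= \{\text{Andrea}\},
\end{align*}
so this single dependency contributes $2 + 2 + 1 + 1 = 6$ edges to the ground graph.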

\begin{figure}[ht]
    \centering
    \scalebox{0.5}{
        \begin{tikzpicture}
            % Nodes
            \node[draw, ellipse, thick] (1) at (0, 0) {Bob.Type};
            \node[draw, ellipse, thick] (2) at (-3, -2) {Bob.Activity};
            \node[draw, ellipse, thick] (3) at (3, -2) {Bob.Sentiment};
            \node[draw, ellipse, thick] (4) at (0, -4) {Reacts.Frequency};
            \node[draw, ellipse, thick] (5) at (3, -6) {Food\_recipe.Engagement};
            \node[draw, ellipse, thick] (6) at (3, -8) {Food\_recipe.Content};
            \node[draw, ellipse, thick] (7) at (11, 0) {Anna.Type};
            \node[draw, ellipse, thick] (8) at (8, -2) {Anna.Activity};
            \node[draw, ellipse, thick] (9) at (14, -2) {Anna.Sentiment};
            \node[draw, ellipse, thick] (10) at (11, -6) {Meme.Engagement};
            \node[draw, ellipse, thick] (11) at (11, -8) {Meme.Content};

            \node[draw, ellipse, thick] (12) at (22, 0) {Andrea.Type};
            \node[draw, ellipse, thick] (13) at (19, -2) {Andrea.Activity};
            \node[draw, ellipse, thick] (14) at (25, -2) {Andrea.Sentiment};
            \node[draw, ellipse, thick] (15) at (19, -4) {Reacts.Frequency};
            \node[draw, ellipse, thick] (16) at (25, -4) {Reacts.Frequency};
            \node[draw, ellipse, thick] (17) at (19, -6) {Poem.Engagement};
            \node[draw, ellipse, thick] (18) at (19, -8) {Poem.Content};
            \node[draw, ellipse, thick] (19) at (25, -6) {News.Engagement};
            \node[draw, ellipse, thick] (20) at (25, -8) {News.Content};

            \path (1) edge[thick, ->] (2);
            \path (1) edge[thick, ->] (3);
            \path (3) edge[thick, ->] (5);
            \path (3) edge[thick, ->] (10);
            \path (4) edge[thick, ->] (2);
            \path (4) edge[thick, ->] (5);
            \path (4) edge[thick, ->] (8);
            \path (4) edge[thick, ->] (10);
            \path (6) edge[thick, ->] (5);
            \path (7) edge[thick, ->] (8);
            \path (7) edge[thick, ->] (9);
            \path (9) edge[thick, ->] (5);
            \path (9) edge[thick, ->] (10);
            \path (11) edge[thick, ->] (10);

            \path (12) edge[thick, ->] (13);
            \path (12) edge[thick, ->] (14);
            \path (14) edge[thick, ->] (17);
            \path (14) edge[thick, bend left=80, ->] (19);
            \path (15) edge[thick, ->] (13);
            \path (15) edge[thick, ->] (17);
            \path (16) edge[thick, ->] (13);
            \path (16) edge[thick, ->] (19);
            \path (18) edge[thick, ->] (17);
            \path (20) edge[thick, ->] (19);

        \end{tikzpicture}
    }
    \caption{Example of Ground Graph}
    \label{fig:groudgraph-SI}
\end{figure}

Beyond the specific instantiation of the ground graph, to perform relational causal discovery, it is necessary to define an abstract ground graph that generalizes the structure of dependencies without referring to particular entities or relationship instances. The abstract ground graph represents the relational dependencies in the relational causal model at a higher level, capturing attribute interactions without being tied to a specific skeleton. In this representation, nodes correspond to attribute types rather than instances, while edges represent the abstract relational dependencies in the model \citep{maier2014reasoning}. 

To construct the abstract ground graph from a given relational causal model, it is necessary to project the dependencies onto the relevant perspective. The \verb|extend| method devised by \citet{maier2014reasoning} achieves this by mapping the underlying relational dependencies to the set of edges of the abstract ground graph. The \verb|extend| method is defined as follows:
\begin{align*}
    \text{extend}(P_{\text{orig}}, P_{\text{ext}}) &= \left\{ 
P = P_{\text{orig}}^{1, n_o - i + 1} + P_{\text{ext}}^{i+1, n_e} 
\,\middle|\, 
i \in \text{pivots}(\text{reverse}(P_{\text{orig}}), P_{\text{ext}}) 
\wedge \text{validPath}(P) 
\right\} \\
\\
\text{pivots}(P_1, P_2) &= \left\{ i \,\middle|\, P_1^{1,i} = P_2^{1,i} \right\}
\end{align*}
where $\text{validPath}(P)$ checks that the relational path is valid with respect to the schema and its relationships' cardinalities. \\
Each abstract ground graph edge $
[B, \ldots, I_k].Y \rightarrow [B, \ldots, I_j].X$ is then constructed from the underlying dependency $[I_j, \ldots, I_k].Y \rightarrow [I_j].X$ with the following logic:
\begin{align*}
    \left\{
[B, \ldots, I_k].Y \rightarrow [B, \ldots, I_j].X \,\middle|\,
[I_j, \ldots, I_k].Y \rightarrow [I_j].X \in \mathcal{D} \,\wedge\,
[B, \ldots, I_k] \in \text{extend}([B, \ldots, I_j], [I_j, \ldots, I_k])
\right\}
\end{align*}
For example, the relational dependency $[P, R, U].Sentiment \rightarrow [P].Engagement$, which in the ground graph manifests as instance-specific edges (e.g., Bob.Sentiment $ \rightarrow $ Food\_Recipe.Engagement), is represented in the abstract ground graph for the perspective USER (Figure \ref{fig:agg}) by the directed edges $[U].Sentiment \rightarrow [U, R, P].Engagement$ and $[U, R, P, R, U].Sentiment \rightarrow [U, R, P].Engagement$. 
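
These two edges can be recovered by evaluating \verb|extend| by hand. The following is a sketch for $P_{\text{orig}} = [U, R, P]$ and $P_{\text{ext}} = [P, R, U]$ (so $n_o = n_e = 3$). It assumes that $\text{validPath}$ rejects a path that enters and leaves a relationship through the same entity type, following the path definition of \citet{maier2014reasoning}, and that a subpath with start index larger than its end index is empty. Since $\text{reverse}(P_{\text{orig}}) = [P, R, U] = P_{\text{ext}}$, we have $\text{pivots}(\text{reverse}(P_{\text{orig}}), P_{\text{ext}}) = \{1, 2, 3\}$ and:
\begin{align*}
    i = 1:&\quad P = P_{\text{orig}}^{1,3} + P_{\text{ext}}^{2,3} = [U, R, P] + [R, U] = [U, R, P, R, U] && \text{(valid, as REACTS is MANY TO MANY)} \\
    i = 2:&\quad P = P_{\text{orig}}^{1,2} + P_{\text{ext}}^{3,3} = [U, R] + [U] = [U, R, U] && \text{(invalid)} \\
    i = 3:&\quad P = P_{\text{orig}}^{1,1} + P_{\text{ext}}^{4,3} = [U] + [\,] = [U] && \text{(valid)}
\end{align*}
Hence $\text{extend}([U, R, P], [P, R, U]) = \{[U, R, P, R, U], [U]\}$, which yields the source nodes $[U, R, P, R, U].Sentiment$ and $[U].Sentiment$ of the two edges into $[U, R, P].Engagement$.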

Similarly, other relational dependencies in the model are reflected as edges between attribute types in the abstract ground graph, providing a compact and generalized view of how information propagates through the relational structure. Analyzing the abstract ground graph makes it possible to reason about potential influences and dependencies at the schema level without requiring explicit enumeration of individual instances.

\begin{figure}[ht]
    \centering
        \begin{tikzpicture}
            % Nodes
            \node[draw, ellipse, thick] (1) at (-1, -1.5) {[U].Sentiment};
            \node[draw, ellipse, thick] (2) at (-1, 0) {[U].Type};
            \node[draw, ellipse, thick] (3) at (-1, 1.5) {[U].Activity};
            \node[draw, ellipse, thick] (4) at (2.5, 0) {[U, R, P].Engagement};
            \node[draw, ellipse, thick] (5) at (2.75, -1.5) {[U, R, P].Content};
            \node[draw, ellipse, thick] (6) at (7.5, -1.5) {[U, R, P, R, U].Sentiment};
            \node[draw, ellipse, thick] (7) at (7.5, 0) {[U, R, P, R, U].Type};
            \node[draw, ellipse, thick] (8) at (7.5, 1.5) {[U, R, P, R, U].Activity};
            \node[draw, ellipse, thick] (9) at (2.5, 1.5) {[U, R].Frequency};

            \path (2) edge[thick, ->] (1);
            \path (2) edge[thick, ->] (3);
            \path (9) edge[thick, ->] (3);
            \path (9) edge[thick, ->] (4);
            \path (1) edge[thick, ->] (4);
            \path (5) edge[thick, ->] (4);
            
            \path (7) edge[thick, ->] (6);
            \path (7) edge[thick, ->] (8);
            \path (9) edge[thick, ->] (8);
            \path (6) edge[thick, ->] (4);
        \end{tikzpicture}
    \caption{Example of Abstract Ground Graph for perspective USER}
    \label{fig:agg}
\end{figure}
\subsection{MAGs and PAGs}\label{fci}
In this subsection, we provide an example of how more than one MAG can belong to the same equivalence class and thus be represented by the same PAG. Given a set of observed variables, let \textbf{Cond} in Figure \ref{fig:pag1} denote the set of conditional independencies. \textbf{Cond} is entailed by several DAGs, two of which are shown in the figure. Figure \ref{fig:pag2} displays the PAG generated for \textbf{Cond}. The edge marks at A and D are ◦ because \textbf{Cond} does not determine them: these marks can differ across the DAGs in \textit{O-Equiv}(\textbf{Cond}).
\begin{figure}[h!]
    \centering
    \begin{subfigure}{0.45\columnwidth}
        \centering
        % \includegraphics[width=\linewidth]{imgs/pag1.png}
        \scalebox{0.7}{
            \begin{tikzpicture}
                % Define points
                \node (A1) at (0,0) {A};
                \node (B1) at (0.8,0) {B};
                \node (C1) at (1.6,0) {C};
                \node (D1) at (2.4,0) {D};
                \node[draw] (L1_1) at (1.2, 0.8) {L1};
    
                % \node (A2) at (0,-1.5) {A};
                % \node (B2) at (0.8,-1.5) {B};
                % \node (C2) at (1.6,-1.5) {C};
                % \node (D2) at (2.4,-1.5) {D};
                % \node[draw] (L1_2) at (1.2, -0.7) {L1};
                % \node[draw] (L2_2) at (1.2, -2.3) {L2};

                \node (A2) at (3,0) {A};
                \node (B2) at (3.8,0) {B};
                \node (C2) at (4.6,0) {C};
                \node (D2) at (5.4,0) {D};
                \node[draw] (L1_2) at (4.2, 0.8) {L1};
                \node[draw] (L2_2) at (4.2, -0.8) {L2};
            
                % Connect points
                \draw[thick, -{Stealth[round]}] (A1) -- (B1);
                \draw[thick, -{Stealth[round]}] (L1_1) -- (B1);
                \draw[thick, -{Stealth[round]}] (L1_1) -- (C1);
                \draw[thick, -{Stealth[round]}] (D1) -- (C1);
    
                \draw[thick, -{Stealth[round]}] (A2) -- (B2);
                \draw[thick, -{Stealth[round]}] (L1_2) -- (B2);
                \draw[thick, -{Stealth[round]}] (L1_2) -- (C2);
                \draw[thick, -{Stealth[round]}] (L2_2) -- (B2);
                \draw[thick, -{Stealth[round]}] (L2_2) -- (C2);
                \draw[thick, -{Stealth[round]}] (D2) -- (C2);
    
                % Dependencies
                \node[align=left] at (2.7,2) {\{ \{D\} $\perp$ \{A, B\},\\\hspace{0.26cm}\{A\} $\perp$ \{C, D\} \}};
            \end{tikzpicture}
        }
        \caption{DAGs in same O-Equiv(Cond) class}
        \label{fig:pag1}
    \end{subfigure}%
    \hfill
    \begin{subfigure}{0.45\columnwidth}
        \centering
        \begin{tikzpicture}
            % Define points
            \node (A) at (0,0) {A};
            \node (B) at (0.5, -1.5) {B};
            \node (C) at (2, -1.5) {C};
            \node (D) at (2.5,0) {D};
        
            % Connect points
            \draw[thick, {Circle[open]}-{Stealth[round]}] (A) -- (B);
            \draw[thick, {Stealth[round]}-{Stealth[round]}] (B) -- (C);
            \draw[thick, {Circle[open]}-{Stealth[round]}] (D) -- (C);
        \end{tikzpicture}
        
        \caption{Resulting PAG for O-Equiv(Cond) class}
        \label{fig:pag2}
    \end{subfigure}
    \caption{DAGs in the same observational equivalence class under \textbf{Cond} (a), and the resulting PAG (b) that captures shared structure and uncertainty in edge directions.}
    \label{fig:both_images}
\end{figure}
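
To see why both DAGs in Figure \ref{fig:pag1} entail \textbf{Cond}, the following sketch applies d-separation with the empty conditioning set to the first DAG, in which $A \rightarrow B \leftarrow L1 \rightarrow C \leftarrow D$:
\begin{align*}
    A \rightarrow B \leftarrow L1 \rightarrow C &: \text{ blocked at the collider } B &&\Rightarrow \{A\} \perp \{C\}, \\
    A \rightarrow B \leftarrow L1 \rightarrow C \leftarrow D &: \text{ blocked at the colliders } B \text{ and } C &&\Rightarrow \{A\} \perp \{D\}, \\
    D \rightarrow C \leftarrow L1 \rightarrow B &: \text{ blocked at the collider } C &&\Rightarrow \{D\} \perp \{B\}.
\end{align*}
Together these give $\{A\} \perp \{C, D\}$ and $\{D\} \perp \{A, B\}$. Adding $L2$ as a second common cause of $B$ and $C$ only creates paths that still pass through the collider $B$ or $C$, so the second DAG entails the same independencies. In both DAGs, $B$ and $C$ remain dependent through a latent common cause, which the PAG in Figure \ref{fig:pag2} records as $B \leftrightarrow C$.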
% \subsection{Assumptions for Relational Causal Discovery}\label{assumptions}

% In this subsection, we define and discuss some key assumptions used for causal discovery in relational data, including the maximum hop threshold, d-faithfulness, acyclicity, and causal sufficiency.
% \begin{itemize}
%     \item Maximum Hop Threshold (\(h\)): The maximum hop threshold defines the largest permissible path length (or number of relational hops) between entities in a relational causal model that will be considered when constructing causal dependencies. Setting \(h\) limits the computational complexity and ensures that the discovered relationships are both interpretable and relevant. For instance, in a social network, \(h = 2\) might capture direct friendships and friends-of-friends relationships while ignoring more distant connections.
%     \item D-Faithfulness: D-faithfulness (Dependency-Faithfulness) posits that any conditional independence observed in the data is also represented in the underlying causal graph, and vice versa. This ensures that the causal relationships inferred from the data align with the observed statistical dependencies in the relational causal model.
%     \item Acyclicity: Acyclicity mandates that the causal graph representing the relationships among variables and entities is a directed acyclic graph (DAG). This means there are no directed cycles in the relational causal model.
%     \item Causal Sufficiency: The assumption of causal sufficiency states that all common causes of the observed variables are measured and included in the data. This implies the absence of latent confounders that could induce spurious associations.
% \end{itemize}


\section{RelFCI Rules}\label{rules}
This section outlines every rule we apply to the new Partial Ancestral Abstract Ground Graph (PAAGG) representation to obtain a maximally informative graph and, thus, an underlying model. We introduce the rules in the framework of PAAGGs, where ◦ marks denote unoriented edge endpoints and $\ast$ denotes any edge mark.
\subsection{RCD Rules}
RCD \citep{maier2013sound} performs relational causal discovery using a strategy similar to the PC algorithm, extended with the RBO purely common cause rule.
The edges of the abstract ground graph are oriented using the following set of rules:
\begin{enumerate}
    \item Collider Detection (CD): For each unshielded triple $\langle\alpha,\beta,\gamma\rangle$, if $\beta$ is not in the set that separates $\alpha$ and $\gamma$, orient it as $\alpha\rightarrowast\beta\leftarrowast\gamma$;
    \item Relational Bivariate Orientation (RBO): Let $\mathcal{M}$ be a relational causal model  and $G$ a partially directed PAAGG for $\mathcal{M}$ for perspective $I_X$, and let there be an unshielded triple in $G$ $\alpha$◦\textemdash◦$\beta$◦\textemdash◦$\gamma$ with $\alpha=[I_X].X, \beta=[I_X,...,I_Y].Y, \gamma=[I_X,...,I_Y,...,I_X].X$. If $\textit{card}([I_Y,...,I_X])=\verb|MANY|$ and $\alpha\independent\gamma\rvert\mathbf{Z}$, then if $\beta\in\textbf{Z}$, orient the triple as $\alpha\leftarrowcircle\beta\rightarrowcircle\gamma$;
    \item Known Non-Colliders (KNC): If $\alpha\rightarrowast\beta$◦\textemdash$\ast\gamma$, with $\alpha,\gamma$ not adjacent, orient the triple as $\alpha\rightarrowast\beta\rightarrow\gamma$;
    \item Cycle Avoidance (CA): If either $\alpha\rightarrow\beta\rightarrowast\gamma$ or $\alpha\rightarrowast\beta\rightarrow\gamma$, with $\alpha\ast$\textemdash◦$\gamma$, orient the latter as $\alpha\rightarrowast\gamma$;
    \item Meek Rule 3 (MR3): If both $\alpha\rightarrowast\beta\leftarrowast\gamma$ and $\alpha\ast$\textemdash◦$\theta$◦\textemdash$\ast\gamma$, with $\alpha,\gamma$ not adjacent and $\theta\ast$\textemdash◦$\beta$, then orient the latter as $\theta\rightarrowast\beta$.
\end{enumerate}
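As an illustration of CD, consider the following sketch based on the abstract ground graph for perspective USER in Figure \ref{fig:agg}, assuming that independence queries are answered by an oracle. The nodes $[U].Sentiment$ and $[U, R, P].Content$ are both adjacent to $[U, R, P].Engagement$ but not to each other, and they are separated by the empty set, i.e., $\textit{SepSet}([U].Sentiment, [U, R, P].Content) = \emptyset$. Because $[U, R, P].Engagement$ is not in this separating set, CD orients the unshielded triple as
\begin{align*}
    [U].Sentiment \rightarrowast [U, R, P].Engagement \leftarrowast [U, R, P].Content .
\end{align*}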
\subsection{FCI Rules}

FCI \citep{ZHANG20081873} constructs a causal graph starting from a fully connected graph whose edges carry ◦ marks, and removes edges between conditionally independent variables. In the second phase, it orients edges by identifying colliders and ``Y'' structures. The remaining edges are then oriented according to a set of additional rules: 
\begin{enumerate}
    \setcounter{enumi}{3}
    \item If $u=\langle\theta,...,\alpha,\beta,\gamma\rangle$ is a discriminating path between $\theta$ and $\gamma$ for $\beta$, and $\beta\circleast\gamma$, then if $\beta\in\textit{SepSet}(\theta,\gamma)$ orient $\beta\rightarrow\gamma$; otherwise orient $\alpha\leftrightarrow\beta\leftrightarrow\gamma$;
    \item For every (remaining) $\alpha$◦\textemdash◦$\beta$, if there is an uncovered path $p = \langle\alpha, \gamma,...,\theta,\beta\rangle$ s.t. all edges are ◦\textemdash◦ and $\alpha, \theta$ are
    not adjacent and $\beta,\gamma$ are not adjacent, then orient all edges in the path as \textemdash;
    \item If $\alpha$\textemdash$\beta$◦\textemdash$\ast\gamma$, with $\alpha, \gamma$ either adjacent or not, orient $\beta$\textemdash$\ast\gamma$;
    \item If $\alpha$\textemdash◦$\beta$◦\textemdash$\ast\gamma$, and $\alpha, \gamma$ are not adjacent, orient $\beta$\textemdash$\ast\gamma$;
    \item If $\alpha\rightarrow\beta\rightarrow\gamma$ or $\alpha$\textemdash◦$\beta\rightarrow\gamma$, and $\alpha$◦$\rightarrow\gamma$, orient  $\alpha\rightarrow\gamma$;
    \item If $\alpha\rightarrowcircle\gamma$ and $p = \langle\alpha,\beta,  \theta,...,\gamma\rangle$ is an uncovered p.d. path from $\alpha$ to $\gamma$ s.t. $\beta$ and $\gamma$ are not adjacent, orient $\alpha\rightarrow\gamma$;
    \item If $\alpha\rightarrowcircle\gamma$, $\beta\rightarrow\gamma\leftarrow\theta$, and $p_1,p_2$ are uncovered p.d. paths from $\alpha$ to $\beta$ and from $\alpha$ to $\theta$, let $\mu$ and $\omega$ be the nodes adjacent to $\alpha$ on $p_1$ and $p_2$, respectively. If $\mu$ and $\omega$ are distinct and not adjacent, orient $\alpha\rightarrow\gamma$.
\end{enumerate}

\section{Algorithms}\label{algo}
The following section provides more detailed pseudocode for each step in the main algorithm. The described algorithm and steps are adapted from the implementation provided in \citet{colombo2012learning}. For easy reference, the main RelFCI pseudocode is provided again below in Algorithm \ref{alg:main}.

\begin{algorithm}[htb]
\caption{RelFCI algorithm}
\label{alg:main}
\textbf{Input}: schema, oracle\\
\textbf{Parameter}: threshold\\
\textbf{Output}: Dependencies
\begin{algorithmic}[1] %[1] enables line numbers
\STATE \textit{// Step 0: Graph initialization}
\STATE $PDs \gets$ get potential dependencies from the base schema (with no dependencies), using twice the threshold ($2h$)
\STATE $PAAGGs \gets$ construct PAAGGs from the potential dependencies set $PDs$
\STATE $S \gets \{\}$
\STATE \textit{// Step 1: Independent Variables identification, storing separating sets and unshielded triples}
\STATE $PAAGGs, S, U \gets \text{obtainInitialSkeleton}(PAAGGs, S)$ 
%  \\
% \verb|// Find and orient v-structures|
\STATE \textit{// Step 2: V-structures orientation using CD, starting from unshielded triples in $U$}
\STATE $PAAGGs, S \gets \text{orientVStructures}(PAAGGs, S, U)$  
% \verb|// Apply orientations rules from RCD, FCI| \\
\STATE \textit{// Step 3: edges orientation using rules from RCD and additional ones from FCI}
\STATE $PAAGGs, S \gets \text{performEdgeOrientation}(PAAGGs, S)$
\STATE $Deps \gets$ retrieve underlying dependencies from oriented PAAGGs edges
% \verb|retrieveDepsFromPAAGGs|(PAAGGs)$
\STATE \textbf{return} Deps
\end{algorithmic}
\end{algorithm}



\begin{algorithm}[htb]
\caption{obtainInitialSkeleton}
\label{alg:step_1}
\textbf{Input}: Schema, Oracle\\
\textbf{Parameter}: threshold, depth\\
\textbf{Output}: Non-oriented AGGs
\begin{algorithmic}[1] %[1] enables line numbers

\FOR{$agg$ \textbf{in} AGGs}
    \STATE Let $l=0$
    \STATE Let $max\_depth=agg.number\_of\_nodes - 2$
    \WHILE{$l \leq max\_depth$}
        \FORALL{pair of vertices ($X_i$, $X_j$) in $agg$}
            \STATE Let $C = agg.nodes - \{X_i, X_j\}$
            \FORALL{$Y \subseteq C$ \textbf{and} $|Y| = l$}
                \IF {CITest($X_i$, $X_j$, $Y$)}
                    \STATE Remove dependencies between ($X_i$, $X_j$)
                    \STATE Store $Y$ as $sepSet$ for ($X_i$, $X_j$)
                \ENDIF
            \ENDFOR
        \ENDFOR
        \STATE Let $l = l + 1$
    \ENDWHILE

    \STATE
    \FORALL{triple of vertices ($X_k$, $X_j$, $X_m$) in $agg$}
        \IF{$k < m$}
            \IF{$agg.has\_edge(X_k, X_j)$ \textbf{and} $agg.has\_edge(X_j, X_m)$ \textbf{and not} $agg.has\_edge(X_k, X_m)$}
                \STATE Append ($X_k$, $X_j$, $X_m$) to $unshieldedTriples[agg]$ 
            \ENDIF
        \ENDIF
    \ENDFOR
\ENDFOR
\end{algorithmic}
\end{algorithm}
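
The cost of Algorithm \ref{alg:step_1} can be bounded with a short calculation. The following sketch assumes that pairs are unordered, that no edge is removed early, and that each conditioning set of size $l$ is tested once at level $l$; $n$ denotes the number of nodes of the graph being processed. For each pair $(X_i, X_j)$, the candidate set $C$ has $n - 2$ elements, so the number of conditional independence tests is at most
\begin{align*}
    \binom{n}{2} \sum_{l=0}^{n-2} \binom{n-2}{l} = \binom{n}{2}\, 2^{\,n-2} .
\end{align*}
For a nine-node graph such as the abstract ground graph in Figure \ref{fig:agg}, this gives $36 \cdot 2^{7} = 4608$ tests in the worst case.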


\begin{algorithm}[htb]
\caption{orientVStructures}
\label{alg:step_2}
\textbf{Input}: Schema, Oracle\\
\textbf{Parameter}: threshold, depth\\
\textbf{Output}:  Partially oriented AGGs
\begin{algorithmic}[1] %[1] enables line numbers
\FOR{$agg$ \textbf{in} AGGs}
    \WHILE{$unshieldedTriples[agg]$}
        \STATE Let $(X_i, X_j, X_k) = unshieldedTriples[agg].pop()$
        \STATE Let $Z = sepSet(X_i, X_k) - \{ X_j\}$
        \IF {\textbf{not} CITest($X_i$, $X_j$, $Z$) \textbf{and} \textbf{not} CITest($X_j$, $X_k$, $Z$)}
            \STATE Append ($X_i$, $X_j$, $X_k$) to $dependentTriples[agg]$ 
        \ELSE
            \FOR{$X_r$ \textbf{in} [$X_i$, $X_k$]}
                \IF {CITest($X_r$, $X_j$, $Z$)}
                    \STATE Let $Y = findMinimalSepset(X_r, X_j, Z)$
                    \STATE Store $Y$ as $sepSets$ for ($X_r$, $X_j$)
                    \FORALL{$X_x$ \textbf{in} $agg.nodes$}
                        \IF{isTriangle($X_{\min(r,j)}, X_x, X_{\max(r,j)}$)}
                            \STATE Add the triple ($X_{\min(r,j)}$, $X_x$, $X_{\max(r,j)}$) to $unshieldedTriples[agg]$
                        \ENDIF
                    \ENDFOR
                    \FORALL{triple \textbf{in} $unshieldedTriples[agg]$}
                        \STATE Delete the triple if matches one of the following patterns: $(X_r, X_j, \cdot )$, $(X_j, X_r, \cdot )$, $( \cdot, X_j, X_r)$ and $(\cdot, X_r, X_j)$
                    \ENDFOR
                    \STATE Remove dependencies between ($X_r$, $X_j$)
                \ENDIF
            \ENDFOR
        \ENDIF 
    \ENDWHILE

    \FORALL{triple \textbf{in} $dependentTriples[agg]$}
        \STATE Let $X_i, X_j, X_k = triple$
        \IF{$X_j$ \textbf{not in} $sepSets(X_i, X_k)$ \textbf{and} $agg.has\_edge(X_i, X_j)$ \textbf{and} $agg.has\_edge(X_j, X_k)$}
            \STATE Orient the triple as a collider
        \ENDIF
    \ENDFOR
\ENDFOR
\end{algorithmic}
\end{algorithm}


\begin{algorithm}[htb]
\caption{performEdgeOrientation}
\label{alg:step_3}
\textbf{Input}: Schema, Oracle,\\
\textbf{Parameter}: threshold, depth\\
\textbf{Output}:  Maximally oriented AGGs
\begin{algorithmic}[1] %[1] enables line numbers
\FOR{$agg$ \textbf{in} AGGs}
    \WHILE{AGG is updated}
        \STATE Orient as many edges as possible by applying the RBO rule
        \STATE Orient as many edges as possible by applying rules FCI\_1--FCI\_3
        \FORALL{possible triples}
            \STATE Let $X_l, X_j, X_k = triple$
            \IF{$isTriangle(X_l, X_j, X_k)$ \textbf{and} $X_j  \circleast X_k$ \textbf{and} $X_l \leftarrowast X_j$ \textbf{and} $X_l \rightarrow X_k$}
                \STATE Find Minimal Discriminating Path for the triple
                \IF{minimalDiscriminatingPath}
                    \FORALL{adjacent couples}
                        \STATE Let $X_r, X_q = couple$
                        \STATE Let $otherSepSet = sepSets(X_i, X_k) - \{X_r, X_q\}$
                        \STATE Let $l = -1$
                        \WHILE{$|otherSepSet| > l$}
                            \STATE Let $l = l + 1$
                            \FORALL{$Y \subseteq otherSepSet$ \textbf{and} $|Y| = l$}
                                \IF{CITest($X_r$, $X_q$, $Y$)}
                                    \STATE Store $Y$ as $sepSet$ for ($X_r$, $X_q$)
                                    \FORALL{$X_x$ \textbf{in} $agg.nodes$}
                                        \IF{isTriangle($X_{min(r,q)}, X_x, X_{max(r,q)}$)}
                                            \STATE Add the triple ($X_{min(r,q)}$, $X_x$, $X_{max(r,q)}$) to $unshieldedTriples[agg]$
                                        \ENDIF
                                    \ENDFOR
                                    \STATE Remove dependencies between ($X_r$, $X_q$)
                                    \STATE Execute Algorithm~\ref{alg:step_2}
                                \ENDIF
                            \ENDFOR
                        \ENDWHILE
                    \ENDFOR
                    \IF{Still adjacent \textbf{and} $X_j$ \textbf{in} $sepSets(X_i, X_k)$}
                        \STATE Orient $X_j \rightarrow X_k$
                    \ELSIF{Still adjacent}
                        \STATE Orient $X_l \leftrightarrow X_j \leftrightarrow X_k$
                    \ENDIF
                \ENDIF
            \ENDIF
        \ENDFOR
        \STATE Orient as many edges as possible by applying rules FCI\_5--FCI\_10
    \ENDWHILE
\ENDFOR

\end{algorithmic}
\end{algorithm}

\clearpage

\section{Possible Dependencies}\label{deps}
The presence of ◦ marks on the edges of the $PAAGGs$, and thus in the underlying $PARM$, implies that the \textit{O-Equiv}($\mathcal{D}_{\textbf{O}}$) class contains several distinct relational causal models. The algorithm's output is therefore not the exact relational causal model that generated the data: RelFCI returns an equivalence class that contains that model. 
RelFCI relies on conditional independence tests among the variables, and different underlying topologies can entail the same test results; e.g., the independence fact $A \independent C \mid B$ is compatible with $A \rightarrow B \rightarrow C$, $A \leftarrow B \rightarrow C$, and $A \leftarrow B \leftarrow C$, but not with the collider $A \rightarrow B \leftarrow C$, in which conditioning on $B$ makes $A$ and $C$ dependent \citep{spirtes2000causation}.
RelFCI works by learning the edge orientations of each $PAAGG$, which are defined by the underlying relational dependencies. 
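
As a brief worked illustration of this indistinguishability, using generic variables $A$, $B$, and $C$, the three structures in which $B$ is a non-collider factorize the joint distribution as
\[
\begin{aligned}
A \rightarrow B \rightarrow C &: \quad P(A, B, C) = P(A)\, P(B \mid A)\, P(C \mid B), \\
A \leftarrow B \rightarrow C &: \quad P(A, B, C) = P(B)\, P(A \mid B)\, P(C \mid B), \\
A \leftarrow B \leftarrow C &: \quad P(A, B, C) = P(C)\, P(B \mid C)\, P(A \mid B).
\end{aligned}
\]
In each case, dividing by $P(B)$ gives $P(A, C \mid B) = P(A \mid B)\, P(C \mid B)$, so all three entail $A \independent C \mid B$ and cannot be told apart by independence tests alone. A collider $A \rightarrow B \leftarrow C$ instead factorizes as $P(A)\, P(C)\, P(B \mid A, C)$, which entails the marginal independence of $A$ and $C$ but, in general, not their independence given $B$.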

When the algorithm concludes and collects all the information learned to produce the $PARM$, the remaining ◦ marks lose significance in terms of relational dependencies. 
The definition of relational dependency in canonical form implies a natural orientation, i.e., $[I_X...I_Y].Y \rightarrow [I_X].X$. Orienting dependencies the other way around is an infraction of the definition, i.e., $[I_X].X \nrightarrow [I_X...I_Y].Y$. 
For this reason, given this formalization of the problem, we differentiate the information the algorithm learns by clearly stating which relational dependencies are required to define the $PARM$ and which are instead allowed. 
We define the required relational dependencies with a $\rightarrow$, i.e., $[I_X...I_Y].Y \rightarrow [I_X].X$ and the ones that are allowed but not necessary with a $\leadsto$, i.e., $[I_X...I_Y].Y \leadsto [I_X].X$. We will refer to the latter as \textit{Possible Dependencies}.

\section{PAAGG edge orientation}\label{edges}
We apply the four PC rules and the new RBO rule, described in RCD, and further apply the rules of FCI, as defined by \citet{ZHANG20081873}, adapted for the PAAGG representation. A latent relational causal model consists of a set of AGGs, one for each perspective, derived from the same set of relational dependencies $\mathcal{D}$. Similarly, both MAAGGs and PAAGGs are derived from the same collection of observed relational dependencies $\mathcal{D}_\textbf{O}$. In classical AGGs, activating a rule in a certain abstract ground graph involves propagating the orientation of the underlying dependency across all AGGs \citep{maier2013sound}. 

Consider a PARM $\mathbfcal{M}$ defined over the set of dependencies $\mathcal{D}_\textbf{O}$ and its corresponding PAAGG $G$ for the perspective $\mathcal{B}$. Let $\alpha=[\mathcal{B},..., I_X].X$ and $\gamma=[\mathcal{B},..., I_Y].Y$ be two nodes in $G$, $\alpha-\gamma$ be a bidirected edge in $G$, and $d_1=[I_X, ..., I_Y].Y\rightarrow[I_X].X\in\mathcal{D}_\textbf{O}$ be the underlying dependency that yields the left direction of the edge.
The FCI rules can orient a PAG edge with three edge marks: ◦, \textemdash, and $\rightarrow$. We apply these orientations to the PAAGG using the following logic:
\begin{itemize}
    \item The orientation $\alpha$◦\textemdash$\gamma$ implies that the underlying dependency $d_1$ belongs to the set of possible dependencies;
    \item The orientation $\alpha - \gamma$ implies that the underlying dependency $d_1$ is inconsistent with the edge orientation and, as such, does not exist in the underlying PARM;
    \item The orientation $\alpha \leftarrow \gamma$ indicates that the underlying dependency $d_1$ is consistent with the edge orientation and belongs to the set of required dependencies.
\end{itemize}
With this logic, the same propagation property applies to the new representations that share the same underlying dependencies, because required and possible dependencies are propagated in the same way. 


\section{Example Execution of RelFCI}
To illustrate the functioning of the RelFCI algorithm, we provide a step-by-step execution over an example relational causal model. This walk-through demonstrates the graphical transformations applied to the Partial Ancestral Abstract Ground Graph (PAAGG) across the different phases of the algorithm. Each figure referenced corresponds to a visual depiction of the model after the respective step of the algorithm. 

\noindent\textbf{Note:} For this example, we focus on a single perspective (in this case, $AB1$). Similar graphs and reasoning are applied to all other perspectives. Rule propagation ensures that orientations in one PAAGG are reflected across others in line with shared underlying dependencies.

\subsection*{Initial Model and Underlying Graph}

\begin{figure}[h!]
    \centering
    \includegraphics[width=0.8\textwidth]{imgs/UnderlyingGraph.png}
    \caption{Relational causal model with entities, relationships, and dependencies, including latent variables.}
\end{figure}

We begin with a relational causal model that includes observed and latent variables. The figure depicts:

\begin{itemize}
    \item Entities $A$ and $B$ with a relationship $AB1$;
    \item Attributes $A_1, A_2, A_3, B_1, B_2$ (observed), and $B_3$ (latent, represented with a double-bordered octagon);
    \item Dependencies between relational variables, considering a hop threshold $h=2$:
        \begin{itemize}
            \item Observed dependencies $\in \mathcal{D}_{\boldsymbol{O}}$: 
                    $[A].A_1 \rightarrow [A].A_2$, 
                     $[A, AB1, B].B_2 \rightarrow [A].A_2$,
                     $[A, AB1, B].B_2 \rightarrow [A].A_3$,
                     $[B].B_1 \rightarrow [B].B_2$.
             \item Unobserved dependencies $\in \mathcal{D}_{\boldsymbol{L}}$:
                    $[A, AB1, B].B_3 \rightarrow [A].A_2$,
                     $[A, AB1, B].B_3 \rightarrow [A].A_3$,
                     $[AB1, B].B_3 \rightarrow [AB1].AB1_1$.
        \end{itemize}
\end{itemize}

\subsection*{Phase 0 – PAAGG Construction}

In this phase, the algorithm constructs the PAAGGs with all possible dependencies:

\begin{itemize}
    \item A node is created for each relational variable with a path length up to the hop threshold $h'=2h=4$.
    \item Edges are added according to the $\texttt{extend}$ method, resulting in a fully connected undirected graph with $\circ{-}\circ$ marks.
    \item Intersection variables are included if needed to maintain the closure under intersections. In this example, these variables are excluded from the plots for better readability.
\end{itemize}

The graph in Figure \ref{fig:phase-0} represents the PAAGG with all potential dependencies for the perspective $AB1$.

\begin{figure}[h!]
    \centering
    \includegraphics[width=\textwidth]{imgs/Phase 0 - MAGG persp AB1.png}
    \caption{Fully connected PAAGG for perspective $AB1$.}
    \label{fig:phase-0}
\end{figure}

\subsection*{Phase 1 – Initial Skeleton Identification via Conditional Independence Testing}

The algorithm now performs conditional independence tests between every pair of variables, using conditioning sets of increasing size. If two variables are found to be independent given a conditioning set, the edge between them is removed, and the set is stored as their separating set.
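
\noindent The following Python sketch illustrates this edge-removal loop in a simplified, propositional setting. It is not the RelFCI implementation: the function \texttt{learn\_skeleton}, the oracle \texttt{chain\_oracle}, and the choice of drawing candidate conditioning sets from the current neighbours of either endpoint are assumptions made only for illustration.

\begin{lstlisting}[language=Python]
from itertools import combinations


def learn_skeleton(nodes, ci_test, max_size=None):
    """Drop an edge as soon as some conditioning set separates its endpoints."""
    adj = {v: set(nodes) - {v} for v in nodes}  # start fully connected
    sepsets = {}
    limit = len(nodes) - 2 if max_size is None else max_size
    size = 0
    while size <= limit:
        for x, y in combinations(nodes, 2):
            if y not in adj[x]:
                continue
            # candidate conditioning variables: current neighbours of x or y
            candidates = sorted((adj[x] | adj[y]) - {x, y})
            for cond in combinations(candidates, size):
                if ci_test(x, y, set(cond)):
                    adj[x].discard(y)
                    adj[y].discard(x)
                    sepsets[frozenset((x, y))] = set(cond)
                    break
        size += 1
    return adj, sepsets


# Hypothetical oracle for the chain A -> B -> C:
# the only independence is between A and C, given B.
def chain_oracle(x, y, cond):
    return {x, y} == {"A", "C"} and "B" in cond


skeleton, sepsets = learn_skeleton(["A", "B", "C"], chain_oracle)
\end{lstlisting}

\noindent With this oracle, the edge between $A$ and $C$ is removed once the conditioning sets reach size one, and $\{B\}$ is stored as its separating set; the other two edges are kept.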

\begin{figure}[h!]
    \centering
    \includegraphics[width=\textwidth]{imgs/Phase 1 - MAGG persp AB1.png}
    \caption{PAAGG after conditional independence testing.}
\end{figure}

Unshielded triples are also identified at this stage as candidate collider patterns. In this example, the following triples are found: 	 
\begin{itemize}
        \item $[AB1].AB1_1, [AB1, A].A_2, [AB1, A].A_1$;
	 \item $[AB1].AB1_1, [AB1, A].A_2, [AB1, B].B_2$;
	 \item $[AB1].AB1_1, [AB1, A].A_2, [AB1, A, AB1, B].B_2$;
	 \item $[AB1].AB1_1, [AB1, A].A_3, [AB1, B].B_2$;
	 \item $[AB1].AB1_1, [AB1, A].A_3, [AB1, A, AB1, B].B_2$;
	 \item $[AB1].AB1_1, [AB1, B, AB1, A].A_2, [AB1, B].B_2$;
	 \item $[AB1].AB1_1, [AB1, B, AB1, A].A_2, [AB1, B, AB1, A].A_1$;
	 \item $[AB1].AB1_1, [AB1, B, AB1, A].A_3, [AB1, B].B_2$.
    \end{itemize}

\subsection*{Phase 2 – Collider Detection and V-Structure Orientation}

This phase introduces the first directed edge orientations in the graph. The algorithm starts by checking whether each unshielded triple is dependent (i.e., for a triple $X, Y, Z$, neither the pair $X, Y$ nor the pair $Y, Z$ is independent given the separating set of $X$ and $Z$ with $Y$ removed). In this example, all of the unshielded triples listed above are identified as dependent. Then, the CD rule is applied to identify and orient colliders among these triples.
The PAAGG after CD is applied is shown in Figure \ref{fig:phase-2}.
\begin{figure}[h!]
    \centering
    \includegraphics[width=\textwidth]{imgs/Phase 2 - MAGG persp AB1.png}
    \caption{PAAGG after collider orientation via CD.}
    \label{fig:phase-2}
\end{figure}


\subsection*{Phase 3 – Further Orientation via RCD and FCI Rules}

In this step, remaining ambiguous edge marks are refined using the additional RCD (RBO, CA, MR3, and KNC) and FCI rules, repeating this process until no rule can be applied anymore. For this example:
\begin{itemize}
    \item Rule KNC is activated once to orient the triple $[AB1, A, AB1].AB1_1 \rightarrowast [AB1, A].A_3 \rightarrow [AB1, B, AB1, A].A_2$ and all other triples sharing the same underlying dependencies;
    \item FCI rule R4 is activated once to orient the triangle $[AB1, A].A_3 \leftrightarrow [AB1].AB1_1 \leftrightarrow [AB1, B, AB1, A].A_2$ and all other triples sharing the same underlying dependencies;
    \item All other rules are not activated.
\end{itemize}

After all rule applications and orientation propagation, the resulting PAAGG (Figure \ref{fig:final}) is maximally informative: each remaining $\circ$ mark reflects a true ambiguity in the equivalence class \textit{O-Equiv}($\mathcal{D}_{\textbf{O}}$).

\begin{figure}[h!]
    \centering
    \includegraphics[width=\textwidth]{imgs/Phase 3 - MAGG persp AB1.png}
    \caption{Final PAAGG with maximally informative edge orientations.}
    \label{fig:final}
\end{figure}

\subsection*{Output – Extraction of Dependencies}

From the oriented PAAGGs, the algorithm extracts the required and possible underlying dependencies. These define the Partial Ancestral Relational Model, shown in Figure \ref{fig:parm}. 

\begin{figure}[H]
    \centering
    \includegraphics[width=0.8\textwidth]{imgs/LearnedModel.png}
    \caption{Learned PARM for the example model.}
    \label{fig:parm}
\end{figure}


\section{Proofs}\label{proofs}

This section contains complete proofs for all the theoretical results presented in the main paper.

\begin{lemma}\label{lemma:gg}
    Given a relational causal model structure $\mathcal{M}$ and perspective $\mathcal{B}$, if an abstract ground graph $AGG_{\mathcal{M}\mathcal{B}}$ is ancestral, then all ground graphs $GG_{\mathcal{M}\sigma}$, with skeleton $\sigma\in\Sigma_\mathcal{S}$, are ancestral.
\end{lemma}
\begin{proof}
    From the definition of \citet{ZHANG20081873}, a graph is ancestral if:
    \begin{enumerate}
    \item There is no directed cycle, i.e., no pair of vertices such that B$\rightarrow$A is in $G$ and A is an ancestor of B (meaning there is a directed path from A to B);
    \item There is no almost directed cycle, i.e., no pair of vertices such that B$\leftrightarrow$A is in $G$ and A is an ancestor of B;
    \item For any undirected edge A\textemdash B, neither A nor B has parents or spouses, i.e., there is no vertex X such that X$\rightarrow$A, X$\leftrightarrow$A, X$\rightarrow$B, or X$\leftrightarrow$B.
    \end{enumerate}
To prove this lemma, we must show, for each of the three conditions, that if the AGG is ancestral, then all GGs are ancestral as well. Given the definition of the abstract ground graph building process in Definition 5.2 and Theorem 5.2 from \citet{maier2014reasoning}, we know that the AGG is sound and complete for all ground graphs for a given perspective and hop threshold $h$. This implies that the AGG captures every dependence path between two variables in every GG. In the same way, each path of dependence between two variables in the AGG is mirrored in at least one GG. We now verify the lemma for the three conditions of ancestrality:
    \begin{enumerate}
        \item Assume, for contradiction, that the AGG is ancestral and that one of the ground graphs, $G$, has a directed cycle between $A$ and $B$. Then the two dependence paths forming the cycle in $G$ are also present in the AGG, resulting in a directed cycle there. Thus, the AGG cannot be ancestral, which contradicts the assumption;
        \item Similar reasoning can be carried out when considering almost directed cycles containing double-arrowed edges (in the case of \textit{Maximal Ancestral Abstract Ground Graphs}), thus verifying the lemma for this condition as well;
        \item Given the assumptions of the underlying structure's acyclicity and no selection bias (i.e., no variables are in the set \textbf{S}), an undirected edge cannot exist, as it corresponds to the presence of selection variables of which both endpoints are causes \cite{ZHANG20081873}. Thus, this condition does not apply to AGGs.
    \end{enumerate}
\end{proof}
Lemma \ref{lemma:gg} guarantees that the theoretical reasoning devised for MAGs and PAGs can also be applied to the relational counterparts we provide in this work, MAAGGs and PAAGGs. In other words, the ancestrality of these lifted relational representations corresponds to the same ancestrality properties in the underlying ground graphs and, thus, in the underlying latent relational causal model we want to learn.
\begin{proposition}
    Given a relational causal model $\mathcal{M}_\textbf{L}(\mathcal{S},\mathcal{D})$ with hop threshold $h$, and its respective latent abstract ground graph $LAGG$:
    \begin{enumerate}[label=\Roman*.]
    \item The constructed MAAGG probabilistically and causally represents $LAGG$ and thus the underlying relational causal model; \label{item1} 
    \item Assuming a sound and complete procedure to construct the $PAAGG$, it correctly represents the Markov equivalence class of the produced $MAAGG$ and, therefore, of $LAGG$ and the underlying model $\mathcal{M}_\textbf{L}$.
    \end{enumerate}
\end{proposition}
\begin{proof} 
\begin{enumerate}[label=\Roman*.]
\item We can demonstrate that the MAAGG, constructed from $LAGG$ by employing the same MAG construction procedure provided in \citet{ZHANG20081873}, probabilistically and causally represents it as a result of theorem 4.18 of \citet{Richardson2002AncestralGM}, where they show that the independence model corresponding to the constructed graph coincides with the one obtained by marginalizing and conditioning the model on the original graph ($LAGG$). Furthermore, the MAAGG also represents the model $\mathbfcal{M}_\mathbf{L}$, which follows from Lemma \ref{lemma:gg}.
\item Under the assumption of a sound and complete procedure for generating said representation (i.e., the RelFCI algorithm), the PAAGG represents the Markov equivalence class containing the MAAGG. This proof follows from \citet{ZHANG20081873}: the PAAGG, constructed from a sound and complete algorithm that outputs a set of graphs which includes all the causal relationships consistent across all MAAGGs, accurately represents the equivalence class. This is because it captures the uncertainty (circle marks) where the data does not provide enough information to distinguish between different causal structures. Finally, from \ref{item1}, we can prove that the PAAGG also represents the equivalence class of $LAGG$ and the underlying model $\mathbfcal{M}_\mathbf{L}$.
\end{enumerate}
\end{proof}

\begin{proposition} \label{prop:hop}
Given a latent relational causal model $\mathcal{M}_\textbf{L}(\mathcal{S},\mathcal{D})$ with hop threshold $h$ and its corresponding PARM $\mathbfcal{M}$, the hop threshold $h'$ of the $PAAGG_{\mathbfcal{M}\mathcal{B}}$ for any perspective $\mathcal{B}$ can be at most $2h$.
\end{proposition}
\begin{proof}
Let us consider a scenario within a relational causal model that allows relational latent variables to be observed and in which the non-dependence of these variables holds (i.e., no latent variable causes another latent variable, which entails there cannot exist a chain of dependencies consisting of multiple consecutive latent variables). 
For the sake of clarity, we will focus on three entities, A, B, and C, each containing one attribute, respectively $A1$, $B1$, and $C1$, with $B1$ designated as latent as in Figure \ref{fig:hop}. 
% The observed attributes \{A, C\} may belong to the same entity or to two distinct entities. Likewise, the latent attribute B may belong to the same entity as the other attributes or to a different one. 
% Let's ponder the necessary hop threshold for encompassing a model wherein all three attributes are observed. 
% Suppose we were to distribute them across disparate entities, and thanks to the proper relational dependencies, we connect all of them, using B as the connecting bridge between the other two attributes. 
Suppose we connect them, using B1 as the bridge between the other two attributes, through the following dependencies: $[A, B].B1 \rightarrow [A].A1$ and $[C, B].B1 \rightarrow [C].C1$, both of which require a hop threshold of one to be represented.
After removing the assumption of having all variables observed, the scenario reverts to one where $B1$ is latent, which means that the dependencies between $B1-A1$ and $B1-C1$ are no longer observable. The possible existing dependencies, containing only relational variables with a path of length two (hop threshold equal to one), make the model unable to express the dependencies among the attributes of different entities, e.g., $[A,B,C].C1\rightarrow[A].A1$ and $[C,B,A].A1\rightarrow[C].C1$.
To account for the relational dependencies between the two entities, we need a relational path that is long enough to traverse the entities and describe the relationship between the variables expressed by the model, which requires twice the original hop threshold of one.
\end{proof}
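
\noindent As a worked instance of the counting argument in the proof, assume the convention, consistent with the example above, that a relational path listing $n$ item classes spans $n-1$ hops. The two latent dependencies use the one-hop paths $[A, B]$ and $[C, B]$. Once $B1$ is marginalized out, the induced dependence between $A1$ and $C1$ has to be expressed along the concatenation of these two paths at $B$:
\[
[A, B] \text{ joined with } [B, C] \text{ gives } [A, B, C],
\qquad
\underbrace{1}_{\text{hops in } [A, B]} + \underbrace{1}_{\text{hops in } [B, C]} = 2 = 2h .
\]
More generally, under the assumption that no latent variable causes another, two dependencies of at most $h$ hops that meet at a latent variable yield a path of at most $2h$ hops, which is the bound on $h'$ stated in Proposition \ref{prop:hop}.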
\begin{figure}[H]
    \centering
    \begin{tikzpicture}
        % Entities
        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm] (A) at (0, 0) {};
        \node[above] at (A.north) {A};
    
        \node[draw, rounded corners, rectangle, minimum width=2cm, minimum height=2cm] (B) at (5, 0) {};
        \node[above] at (B.north) {C};
    
        % Define the four corners of the rhombus
        \coordinate (N) at (2.5, 0.7);
        \coordinate (E) at (3.5, 0);
        \coordinate (S) at (2.5, -0.7);
        \coordinate (W) at (1.5, 0);
        \node[above] at (N.north) {B};
        % \node[above, font=\tiny] at ([xshift=-0.16cm] W.center);
        % \node[above, font=\tiny] at ([xshift=0.16cm] E.center);
    
        % Create attributes inside A
        \node[draw, circle, thick] (A1) at ([xshift=-0.0cm, yshift=-0.0cm] A.center) {A1};
    
        % Create attributes inside B
        \node[draw, circle, font=\tiny, thick, dashed] (B1) at ([yshift=-0.7cm] N.center) {B1};
        
        % Draw the lines to form the rhombus
        \draw (N) -- (E) -- (S) -- (W) -- cycle;
    
        % Create attributes inside C
        \node[draw, circle, thick]  (C1) at ([xshift=-0.0cm, yshift=-0.0cm] B.center) {C1};
      
        % Lines
        \draw (A) -- (W);
        \draw (E) -- (B);
    
        % Curved edge
        \path (B1) edge[bend right, thick, ->, dashed] (A1);
        \path (B1) edge[bend left, thick, ->, dashed] (C1);
        % \path (A1) edge[bend right, thick, <->] (C1);
        % \path (AB1_1) edge[bend left, thick, ->] (A3);
        % \path (A2) edge[bend right, thick, ->] (A1);
        % \path (A2) edge[bend left, thick, ->] (A3);
    \end{tikzpicture}
    \caption{Example of Relational Causal Model with a latent variable}
    \label{fig:hop}
\end{figure}

\begin{theorem}
    Let G be the partially oriented PAAGG from perspective B with the correct set of adjacencies, unshielded colliders oriented correctly through CD and RBO, and as many edges as possible oriented through KNC, CA, MR3, and the purely common cause of RBO. Then, the rules R4-R10 from FCI and the orientation propagations are sound.
\end{theorem}
\begin{proof}
Given Lemma \ref{lemma:gg}, the proof derives from \citet{spirtes1995causal} and \citet{ZHANG20081873}. A rule is sound if the arrows and tails used in the resulting PAAGG are invariant. Therefore, we need to prove that any mixed abstract ground graph $G$ that violates a rule does not belong to the equivalence class \textit{O-Equiv}($\mathcal{D}_{\textbf{O}}$), that is, it is not ancestral or Markov equivalent to the original MAAGG.
The proof for rule R4 is identical to the proof by induction provided in \citet{spirtes1995causal}, stating that by applying iteratively rule R4 on a PAAGG $G$ oriented using rules CD, CA, KNC, and MR3, the resulting graph $G_i$ at each iteration $i$ maintains its ancestral properties for the equivalence class \textit{O-Equiv}($\mathcal{D}_{\textbf{O}}$). The proof for the remaining rules is taken from \citet{ZHANG20081873}:
\begin{itemize}
\item R5: The rule states that the path $p = \langle\alpha, \gamma,...,\theta,\beta,\alpha\rangle$ is an uncovered cycle consisting only of circle marks. If we assume instead that a graph $G$ has an arrowhead on this cycle because of KNC, this cycle must be directed to avoid unshielded colliders. But by doing so, the graph is not ancestral;
\item R6: Any graph $G$ that contains the opposite orientation than the one stated by the rule, i.e., $\alpha$\textemdash$\beta\leftarrow\ast\gamma$, is not ancestral;
\item R7: Suppose that a graph $G$ has an arrowhead into $\beta$, contrary to the rule. Then the triple can be oriented as $\alpha$\textemdash$\beta\leftarrow\ast\gamma$ or $\alpha\rightarrow\beta\leftarrow\ast\gamma$. In the former case, $G$ is not ancestral. In the latter, it contains an unshielded collider not present in the original MAAGG;
\item R8: If a graph $G$ instead of $\alpha\rightarrow\gamma$ contains $\alpha\leftrightarrow\gamma$, then there is an almost directed cycle or an arrowhead into an undirected edge. In both cases, the graph is not ancestral;
\item R9: The same proof for R5 can be applied for this rule;
\item R10: The rule states that $\langle\mu,\alpha,\omega\rangle$ is not a collider in the original MAAGG. Assume that a graph $G$ in the equivalence class contains $\alpha\leftrightarrow\gamma$ instead of the rule specification. Then, for $G$ to be ancestral, one or more edges out of $\alpha$ must be directed. Therefore, to avoid unshielded colliders not in the original MAAGG, $p_1$ or $p_2$ must be a directed path, making $\alpha$ an ancestor of $\gamma$ and thus $G$ not ancestral. 
\end{itemize}
Finally, considering that the rules are proven sound and, as such, all orientations produced are correct, it is straightforward to prove that the respective orientation propagation procedure is sound, following from \cite{maier2013sound}.
\end{proof}
The following two lemmas for the arrowhead and tail completeness make use of a representation defined as \textit{chordal graph}, established in \citet{meek1995causal} and extended in \citet{maier2013sound} for relational data. This representation is an undirected graph where every undirected cycle of length four or more has an edge between two nonconsecutive vertices on the cycle. In chordal graphs, a total order $\alpha$ is consistent with respect to $AGG$ if and only if $AGG_\alpha$ (abstract ground graph in which $A\rightarrow B$ if and only if $A<B$ with respect to $\alpha$) has no unshielded colliders. Furthermore, for all adjacent vertices $A$ and $B$, there exist consistent total orderings $\alpha$ and $\gamma$ such that $A\leftarrow B\in AGG_\alpha$ and $A\rightarrow B\in AGG_\gamma$. 
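
\noindent To make the chordality condition concrete, the following Python sketch checks it on small undirected graphs by repeatedly eliminating a simplicial vertex (one whose neighbours are pairwise adjacent). This standard characterization is equivalent to the cycle-based definition above; the function name \texttt{is\_chordal}, the dictionary-of-sets representation, and the two toy graphs are assumptions made only for illustration.

\begin{lstlisting}[language=Python]
def is_chordal(adjacency):
    """A graph is chordal iff it can be emptied by repeatedly removing
    a vertex whose neighbours form a clique (a simplicial vertex)."""
    graph = {v: set(nb) for v, nb in adjacency.items()}
    while graph:
        simplicial = next(
            (v for v, nb in graph.items()
             if all(b in graph[a] for a in nb for b in nb if a != b)),
            None,
        )
        if simplicial is None:
            return False  # a chordless cycle of length >= 4 remains
        for nb in graph.pop(simplicial):
            graph[nb].discard(simplicial)
    return True


# Toy graphs: a chordless four-cycle, and the same cycle with chord A - C.
four_cycle = {
    "A": {"B", "D"}, "B": {"A", "C"},
    "C": {"B", "D"}, "D": {"A", "C"},
}
with_chord = {
    "A": {"B", "C", "D"}, "B": {"A", "C"},
    "C": {"A", "B", "D"}, "D": {"A", "C"},
}
\end{lstlisting}

\noindent The four-cycle is rejected because no vertex has adjacent neighbours, whereas adding the chord between $A$ and $C$ makes $B$ and $D$ simplicial and the graph chordal.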

\begin{lemma}
Let G be a partially oriented PAAGG with correct adjacencies. Then, exhaustively applying CD, RBO, KNC, CA, MR3, and R4, all with orientation propagation of edges, produces a PAAGG G' in which for every circle mark there exists a MAAGG in the \textit{O-Equiv}($\mathcal{D}_{\textbf{O}}$) class with a corresponding tail mark.
\end{lemma}

\begin{proof}  
The proof follows from Theorem 4.3 of \citet{ali2012towards}. They prove arrowhead completeness for a different graph representation of the Markov equivalence class of MAGs, \textit{Joined Graphs}, which do not distinguish between tail marks and circle marks, given that their work focuses explicitly on arrowhead edge orientations. The same reasoning can be applied to ancestral graphs and, with Lemma \ref{lemma:gg}, to PAAGGs. Let $G'$ be the PAAGG with as many edges as possible oriented using CD, RBO, CA, MR3, and R4. For these proofs, we define the edge marker $\otimes$, which corresponds to either a circle or edge mark. There are four steps to prove the arrowhead completeness: 
\begin{enumerate}
    \item Removing any non-directed edge in $G'$ creates a disjoint union of maximal ancestral PAAGGs. Assume for contradiction that the graph $G^*$ obtained by removing undirected edges is not ancestral. Given that $G^*$ does not contain undirected edges, it cannot contain the following configurations: $A\otimes\rightarrow B$\textemdash$C$ or $A\ast\rightarrow B$\textemdash$C$\textemdash$D\rightarrow A$. Therefore, it contains a partially directed k-cycle such as $X\ast\rightarrow Y\rightarrow ... \rightarrow Z\rightarrow X$. It can be easily proven that no such cycle can exist without contradiction for $k\geq 3$; therefore, $G^*$ is both ancestral and maximal (Lemma 4.1 of their work that proves that the oriented $G'$ contains only triangles with the following forms: \\
    (i) $B\rightarrowast A\leftarrowast C \ast$ \textemdash $\ast B$; (ii) $B \ast$ \textemdash $A$ \textemdash $\ast C \ast$ \textemdash $\ast B$; or (iii) $Y\rightarrowast A $\textemdash $\ast C \leftarrowast B$).
    \item No replacement of the undirected edges in $G'$ by directed edges will result in non-ancestral structures such as partially directed cycles, unshielded colliders, colliders with order, or inducing paths with non-adjacent endpoints that include an edge oriented by the orientation rules. The absence of these non-ancestral structures is a direct consequence of Lemma 4.1. \label{item2}
    \item By removing all directed edges and undirected ones with no parents or spouses from $G'$, the resulting AGG $U$ is a disjoint union of chordal undirected graphs. Assume for contradiction that the orderings of $U$ lead to unshielded colliders. From \ref{item2}, we know that a replacement of undirected edges could generate a collider with order or inducing paths with non-adjacent endpoints. It's also possible to prove by contradiction that if $U$ is not chordal, then the subgraph $U'$ of the partially oriented PAAGG corresponding to $U$ must contain the same non-chordal properties (i.e., unshielded colliders), which is not possible as $U'$ cannot contain an unshielded collider given the orientation provided by the CD rule. Therefore, $U$ must be chordal.
    \item By definition of chordal graph, for every pair $(A,B)$ there are at least two orderings such that $A\rightarrow B$ in one and $A\leftarrow B$ in the other. Therefore, $G'$ is maximally oriented, and as such, the rules  CD, CA, MR3, and R4 are arrowhead complete. 
\end{enumerate}
\citet{maier2013sound} demonstrate the completeness of the merely common cause rule of RBO, which establishes edge orientation through arrowhead marks only. Consider again the PAAGG $G'$. Assume by contradiction that there is an edge in $G'$ with a circle mark (without loss of generality, $A\rightarrowcircle B$), such that there are no MAAGGs in \textit{O-Equiv}($\mathcal{D}_{\textbf{O}}$) with a corresponding tail mark for that edge. This requires that the edge mark correspond to an arrowhead in both the equivalence class and the generated PAAGG. Based on the completeness proofs provided above, one of the rules would have oriented that edge mark with an arrowhead. As a result, there must be a MAAGG in \textit{O-Equiv}($\mathcal{D}_{\textbf{O}}$) that has the edge $A\rightarrow B$, i.e., a tail mark at $A$. 
\end{proof}

\begin{lemma}
Let G' be the partially oriented PAAGG with correct adjacencies and unshielded colliders, and as many edges oriented with KNC, CA, and MR3, all with orientation propagation. Then, applying rules R5-R10, together with orientation propagation, produces a PAAGG G'' such that for every circle mark, there exists a MAAGG in \textit{O-Equiv}($\mathcal{D}_{\textbf{O}}$) in which the corresponding mark is an arrowhead.
\end{lemma}
\begin{proof}
Using Lemma \ref{lemma:gg}, we may follow the tail completeness proof of \citet{ZHANG20081873}. We show that any PAAGG edge with a ◦ mark (e.g., ◦\textemdash, ◦\textemdash◦, ◦$\rightarrow$) corresponds to an arrowhead in a MAAGG in the equivalence class. 
% Zhang (2008) established and proved various theorems and lemmas that can be used as findings of Lemma \ref{lemma:gg}. 
For the first two types of edges (◦\textemdash, ◦\textemdash◦), we make use of some properties of PAGs, proven in \citet{ZHANG20081873} and adapted to PAAGGs:
\begin{enumerate}[label=\textbf{P}\arabic*]
    \item Given a triple $A$, $B$, $C$ in a PAAGG, if $A\rightarrowast B\circleast C$, then there is an edge $A\rightarrowast C$. In addition, if $A\rightarrow B$, then the edge between $A$ and $C$ cannot be $A\leftrightarrow C$;
    \item Given two vertices, $A$ and $B$, in a PAAGG, if $A$\textemdash◦$B$, then there is no edge into $A$ or $B$;
    \item Given a triple $A$, $B$, $C$ in a PAAGG, if $A$\textemdash◦$B\circleast C$, then there is an edge between $A$ and $C$. Furthermore, if $A$\textemdash◦$B$◦\textemdash◦$C$, then the edge between $A$ and $C$ is $A$\textemdash◦$C$; if $A$\textemdash◦$B\rightarrowast C$, then either $A\rightarrow C$ or $A\rightarrowast C$;
    \item Given two vertices, $A$ and $B$, in a PAAGG, if $A$\textemdash◦$B$, then there is no cycle of the form $A$\textemdash◦$B$\textemdash◦$\dots$\textemdash◦$A$.
\end{enumerate}
With these properties, it can be proven that:
\begin{itemize}
    \item For every edge $A$◦\textemdash◦$B$ in the subgraph obtained by keeping only ◦\textemdash◦ edges from the PAAGG (which we denote as $P^C_{AAGG}$), the subgraph can be oriented into two DAGs without unshielded colliders such that $A\rightarrow B$ in one and $A\leftarrow B$ in the other. This is proven by showing that $P^C_{AAGG}$ is chordal. Assume by contradiction that there is a non-chordal cycle $\langle X, Y, W, \dots, Z \rangle$. Then no two non-consecutive vertices in the cycle are adjacent in either $P^C_{AAGG}$ or the original PAAGG, as otherwise they would be connected by a ◦\textemdash◦ edge (by \textbf{P}1 and \textbf{P}3) and hence adjacent in $P^C_{AAGG}$ as well. Therefore, this non-chordal cycle also appears in the PAAGG, where it would have been oriented by rule R5, a contradiction. Hence $P^C_{AAGG}$ is chordal.
    \item Let $H$ be the graph obtained from the following steps applied to the PAAGG:
    \begin{enumerate}
        \item orient all $\rightarrowcircle$ and \textemdash◦ edges into directed ones, i.e., $\rightarrow$;
        \item orient the $P^C_{AAGG}$ into a DAG with no unshielded collider.
    \end{enumerate}
    Then $H$ belongs to the equivalence class represented by the PAAGG: \\
    \textbf{P}1--4 ensure that no directed or almost directed cycle is generated after the first step. For step 2, \textbf{P}1 and \textbf{P}3 ensure that orienting $P^C_{AAGG}$ generates no new directed or almost directed cycles in $H$; furthermore, no new edge into any vertex incident to undirected edges and no inducing path between non-adjacent vertices appears. This verifies that $H$ is ancestral and maximal. It then follows that $H$ belongs to the equivalence class: \textbf{P}1--3 guarantee that no new unshielded colliders are created, and since no new bi-directed edges are created, the discriminating path condition for Markov equivalence between $H$ and the PAAGG is also satisfied.
\end{itemize}
These two results guarantee that no circle on a PAAGG's ◦\textemdash{} and ◦\textemdash◦ edges corresponds to an invariant tail. The proof for the ◦$\rightarrow$ edge follows from Theorem 3 in \citet{ZHANG20081873}, which uses the chordal graph representation established in \citet{meek1995causal} and extended in \citet{maier2013sound} for relational data.
For the PAAGG $G''$, a proof by contradiction similar to the one in the proof of Lemma 2 can then be carried out to show that every circle mark corresponds to an arrowhead in at least one MAAGG in the equivalence class \textit{O-Equiv}($\mathcal{D}_{\textbf{O}}$).
\end{proof}
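
As an illustration of the chordality argument used in the proof above, consider the following minimal example. The vertices $A$, $B$, $C$, $X$, $Y$, $W$, $Z$ and the orderings $\prec$ are hypothetical and are not taken from any model in the paper. Suppose that $P^C_{AAGG}$ is the triangle $A$◦\textemdash◦$B$◦\textemdash◦$C$◦\textemdash◦$A$, which is trivially chordal. Orienting every edge from the earlier to the later vertex under two different orderings gives
\[
A \prec B \prec C:\quad A\rightarrow B,\; A\rightarrow C,\; B\rightarrow C,
\qquad\qquad
B \prec A \prec C:\quad B\rightarrow A,\; B\rightarrow C,\; A\rightarrow C.
\]
Both orientations are acyclic, and neither contains an unshielded collider because every pair of vertices is adjacent. The edge between $A$ and $B$ is oriented as $A\rightarrow B$ in the first and as $A\leftarrow B$ in the second. By contrast, in a chordless cycle $X$◦\textemdash◦$Y$◦\textemdash◦$W$◦\textemdash◦$Z$◦\textemdash◦$X$, every acyclic orientation has at least one vertex into which both of its cycle edges point, and since the two neighbors of that vertex are non-adjacent, it is an unshielded collider. This is why the argument needs to exclude such cycles among the ◦\textemdash◦ edges.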

\setcounter{theorem}{2}
\begin{theorem}
Given a schema and a probability distribution $P(\textbf{V})$ with $\textbf{V}=\textbf{O}\cup\textbf{L}\cup\textbf{S}$, the output of RelFCI is a correct maximally informative PAAGG, and thus a maximally informative PARM $\mathbfcal{M}$, assuming perfect conditional independence tests and a sufficient hop threshold $h'$.
\end{theorem}
\begin{proof}
The following proof sketch is adapted from \citet{maier2014reasoning}. Given a sufficient $h'$ at least equal to $2h$ (Proposition \ref{prop:hop}), the set of potential dependencies $PDs$ includes all true dependencies that generate the respective MAAGG. This implies that the correct adjacencies are generated, including the true causes of each relational variable. The unoriented PAAGGs are then constructed using the procedure from \citet{maier2014reasoning}. Assuming perfect conditional independence tests, the algorithm retains only the correct edges for the PAAGGs. $S$ and $U$ also contain the correct separating sets for every pair of nonadjacent variables and the true unshielded colliders. Next, RelFCI orients all unshielded colliders using either CD or RBO and, finally, produces a maximally informative PAAGG $G$ and PARM $\mathbfcal{M}$ as a consequence of Theorem 1 and Theorem 2.
\end{proof}

\section{Additional Results}\label{res}

We further evaluated the performance of RelFCI in the absence of latent variables to establish a fair comparison with RCD under causal sufficiency. The experimental setup mirrors that described in Section \ref{sec:setup}, and the results are presented in Figure \ref{fig:no-lat}. As shown, RelFCI achieves precision and recall comparable to, and in some configurations slightly exceeding, those of RCD. These results indicate that RelFCI maintains high accuracy even when latent confounders are not present, which is consistent with its soundness in recovering the true causal structure in standard relational settings.
\begin{figure}[ht]
    \centering
    \includegraphics[width=\textwidth]{imgs/plot_nolatvar.png}
    \caption{RelFCI Precision and Recall performance with no latent variables.}
    \label{fig:no-lat}
\end{figure}
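
For reference, the precision and recall in Figure \ref{fig:no-lat} can be read through the standard definitions below. This is only a sketch: it assumes that the metrics compare a set of learned dependencies $\hat{D}$ against the set of true dependencies $D^{*}$. Both symbols are introduced here for illustration, and the evaluation protocol described in Section \ref{sec:setup} remains the authoritative one.
\[
\mathrm{Precision} = \frac{|\hat{D}\cap D^{*}|}{|\hat{D}|},
\qquad
\mathrm{Recall} = \frac{|\hat{D}\cap D^{*}|}{|D^{*}|}.
\]
For instance, if a method outputs $10$ dependencies of which $8$ are true, and the true model contains $16$ dependencies, its precision is $8/10 = 0.8$ and its recall is $8/16 = 0.5$.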

\end{appendix}