\section{Introduction}
The goal of causal discovery is to reveal causal information by analyzing observational data. 
Most causal discovery algorithms assume that the data is independent and identically distributed (i.i.d.) and that the data-generating process can be represented by a directed acyclic graph \citep{heinze2018causal}. 
However, many real-world data sources, including biological and social networks, do not meet the i.i.d. assumption: they contain entities that interact with each other and whose attributes exhibit causal dependencies. 
To capture such dependencies and enable causal reasoning in relational data, more expressive classes of directed graphical models \citep{maier2014reasoning,lee2016,ahsan2022non} and algorithms for relational causal discovery \citep{maier2013sound,lee2016,lee2020,ahsan2023learning} have been developed over the past decade.

Existing relational causal discovery algorithms rely on the strong assumption of causal sufficiency, i.e., that all common causes of observed variables have been measured and included in the data. 
However, this assumption rarely holds for real-world data, where latent confounders can invalidate both causal discovery and causal effect estimation. This is especially true in relational domains, where capturing latent confounders in causal models is key to separating homophily-based correlations from contagion \citep{shalizi-smr11,lee-jasa21}. While multiple algorithms exist for causal discovery with latent confounders in i.i.d. data (e.g., \citet{spirtes2000causation,colombo2012learning}), none of them addresses relational data. 
To facilitate more realistic causal discovery in relational domains, it is necessary to formalize latent confounders in relational causal models and lift the assumption of causal sufficiency. 

In this work, we introduce RelFCI, a relational causal discovery algorithm that accounts for latent confounders in relational data, together with the graphical models that support it. We build upon the representations and algorithms for Fast Causal Inference (FCI) \citep{spirtes2000causation} and Relational Causal Discovery (RCD) \citep{maier2013sound}, neither of which is sufficient on its own: FCI performs causal discovery with latent confounders but does not address relational data, whereas RCD performs relational causal discovery through relational \textit{d}-separation but assumes causal sufficiency. The new relational graphical models are \textit{Latent Relational Causal Models} (LRCMs), \textit{Maximal Ancestral Abstract Ground Graphs} (MAAGGs), and \textit{Partial Ancestral Abstract Ground Graphs} (PAAGGs); we also provide a set of assumptions necessary for causal discovery with latent variables on relational causal models. These models address challenges unique to relational data, such as variable construction across relational paths and partial observation of entities. We then show that, with these models and under the stated assumptions, the rules of FCI, combined with the rules of RCD and applied to PAAGGs, yield a sound and complete procedure for relational causal discovery. Specifically, we prove that RelFCI is sound and complete up to a bounded hop threshold in the presence of latent variables. Finally, we demonstrate the algorithm's correctness on experimental datasets and compare it to existing algorithms.
