% !TEX root =  ../main.tex
\section{Introduction}\label{sec:introduction}
\emph{\enquote{The Book of Why}} \cite{pearl2018book} elects \emph{causality} as the key to overcoming the limits of purely predictive artificial intelligence 
\blfootnote{Authors' contributions breakdown in back matter. Correspondence to: $\text{<gabriele.dacunto@uniroma1.it>}$ and $\text{<cbattiloro@hsph.harvard.edu>}$.}{(AI)}.
This argument recently found mathematical support in the work of \citet{richens2024robust}, who gave evidence that the robustness of AI agents to distributional shifts is conditioned on learning an approximate subjective causal model.
The \emph{structural causal model} (SCM) framework \cite{pearl2009causality} is a gold standard for modeling \emph{cause-effect} relationships in complex systems.
Informally, a (probabilistic) SCM consists of causal variables and variable-specific sources of noise, together with structural equations -- to be read as assignments, as in physics -- determining how each variable is causally influenced by the others (e.g., the test score of a student depends on hours of study, motivation, and some randomness).
An SCM induces the so-called \enquote{ladder of causation} \cite{pearl2018book}: \faIcon{eye} the \emph{associational} layer related to factual information (\emph{seeing}), \faIcon{hammer} the \emph{interventional} layer related to the effects of actions (\emph{doing}), and \faIcon{brain} the \emph{counterfactual} layer related to imagining the effect of an action, given that something else occurred (\emph{retrospection}). 
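As a purely illustrative sketch of the student example (the symbols $H$, $M$, $S$, the noise terms $U_H$, $U_M$, $U_S$, and the functions $f_H$, $f_M$, $f_S$ are introduced only here and are not part of the formal development), one possible SCM reads
\[
H \coloneqq f_H(U_H), \qquad M \coloneqq f_M(U_M), \qquad S \coloneqq f_S(H, M, U_S),
\]
where $H$ denotes the hours of study, $M$ the motivation, and $S$ the test score; the noise term $U_S$ accounts for the randomness affecting the score.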
The ultimate goal of \emph{causal AI} is to empower AI systems with such a ladder of causation for robust and trustworthy decision-making.
This work pushes in the same direction.\\
\noindent\textbf{Motivation.} We are driven by a philosophically simple yet technically unexplored concept: \textit{any causal model is an imperfect and subjective representation of the world, and it cannot be severed from the network of relations the subject is immersed in}. As such, our work in a sense starts where \citet{richens2024robust} end, as they provided a (specific) formalization of the subjectivity of causal models without, however, formalizing their dependency on the subject's relationships. In philosophy, this concept has been extensively debated: manipulability theories view causes as \enquote{handles} for effecting change, tying them to subjective agency \cite{woodward2001manipulability}; pluralist approaches suggest multiple, context-dependent concepts of causation \cite{psillos2010causalpluralism}; the actor-network theory of Latour and others posits that everything in the social and natural worlds exists in constantly shifting networks of relationships, and nothing exists outside those relationships \cite{latour2007actortheory}. A motivating example for our work is \emph{agentic AI}, a frontier paradigm pushing for autonomous AI agents -- the subjects -- to collaborate in solving complex tasks. 
However, in our work, the notion of \enquote{subject} takes on a broader meaning and could represent, for example, the resolution at which a phenomenon is studied; the sensor that detects pollutants in a specific geographical area, different from that of other sensors; the trading book of a bank seen as a proxy for investment strategy.
In all of the above cases, we argue that each subject likely -- and, arguably, hopefully -- develops a subjective SCM, that these SCMs are interconnected, and that their interplay can benefit both the subject and the entire network.
Inherently, this paradigm, which we term \emph{relativity of causal knowledge} and whose core object is \emph{relative causal knowledge} (RCK), is tied to the concept of \emph{perspective}: subjects cannot detach from their world representation when interacting. In our framework, asking for a unique \enquote{true} causal description of a system is an ill-posed question, as the very notion of causality is inherently relative. However, it is important to highlight that the relativity of causal knowledge is different from the \textit{relativism} (in its philosophical meaning) of causal knowledge: \textit{we do not undermine the meaning of things; we question its description as a monolithic object}.\\
\noindent\textbf{Related Works.} RCK has its roots in \emph{category theory} \cite{mac2013categories} and \emph{sheaf theory} \cite{serre1955faisceaux}, and is inspired by Grothendieck's notion of relativism in the context of mathematics \cite{mclarty2003rising}.
Grothendieck revolutionized the understanding of mathematical structures by shifting the focus from individual objects to the relationships between them, as expressed through morphisms. 
This perspective led to a flexible, contextual view of mathematical spaces. 
We draw a parallel to Grothendieck's paradigm shift by treating SCMs as inherently contextual and interconnected rather than isolated entities. 
Just as Grothendieck's use of sheaf theory facilitates local-to-global transitions in mathematics, we employ network sheaves -- first-order cellular sheaves \cite{curry2014sheaves} -- to investigate how causal knowledge, i.e., a set of interventional and observational probability measures, is transferred and transformed across different subjects and their perspectives.
Previous work focused on a functorial characterization of SCMs through mappings between the causal structure and the (discrete) distributions of causal variables \cite{jacobs2019causal}.
Conversely, we provide a category-theoretic treatment of SCMs -- and related interventions -- focusing on morphisms between probability spaces, linking the SCM to the category of convex spaces of probability measures \cite{fritz2009convex}. 
Similar definitions to \Cref{def:SCM_meas,def:scm_fun} appear in the concurrent work \cite{d2025causal}.\\
Relations among SCMs have been investigated over time.
The \emph{transportability} problem \cite{pearl2011transportability,bareinboim2016causal} addresses the transfer of causal knowledge from one environment (the source) to another (the target) under assumptions on \emph{(i)} the knowledge of the underlying causal structure, typically represented by causal graphs, and \emph{(ii)} the types and targets of intervention. 
In \emph{causal transfer learning} \cite{zhang2015multi,rojas2018invariant,magliacane2018domain}, several studies tackle the challenge of transferring causal knowledge from source domains to enhance performance in a target domain (i.e., domain adaptation). 
Next, the study of \emph{equivalence} of SCMs \cite{beckers2021equivalent,verma2022equivalence} aims at identifying equivalent (sub)structures to provide insights into how different systems share common causal relationships.
Particularly relevant to our work is the theory of \emph{causal abstraction} (CA) \cite{rubenstein2017causal,beckers2019abstracting}.
CA formalizes the mappings between SCMs describing the same system at different levels of granularity.
In this paper, we will work under the \alphaabs framework proposed by \citet{rischel2020category}.
The latter is convenient for our purposes because \emph{interventional consistency} (IC) is neatly separated from the definition of the CA.\\
\begin{figure}[t!]
    \centering
    \includegraphics[width=1\linewidth]{figures/gabstract_last.png}
    \caption{The \emph{relativity of causal knowledge} states that causal knowledge (CK) is subjective and interconnected rather than objective and isolated. 
    Multiple subjects of/in the same system will develop multiple, different instances of CK describing the system. Informally, CK can be seen as a set of probability measures corresponding to \faIcon{eye} seeing, \faIcon{hammer} doing, and \faIcon{brain} imagining.
    The CK $\mathsf{CK}^{\rho}$ of subject $\rho$ is fully accessible only to $\rho$. 
    As such, another subject $\sigma$ can only access the \emph{relative causal knowledge} (RCK) $\mathsf{CK}^{\rho, \sigma}$, i.e., the CK of $\rho$ from the perspective of $\sigma$. 
    There is a link $\tau$ between $\rho$ and $\sigma$, i.e., they can communicate, if their CK admits a common interventionally consistent CA acting as \emph{backbone space}. 
    Accordingly, the RCK $\mathsf{CK}^{\rho, \sigma}$ is obtained by first transporting $\mathsf{CK}^{\rho}$ onto $\tau$, which yields a more abstract $\mathsf{CK}^{\rho, \tau}$, and then transporting $\mathsf{CK}^{\rho, \tau}$ onto $\sigma$. 
    Interestingly, subjects that are not directly connected can still access some RCK if there is a path of links among them, but this RCK is first \enquote{filtered} by the perspectives of all the other subjects on the path. 
    We implement RCK using a category-theoretic approach, resulting in novel mathematical objects: \emph{the network sheaf and cosheaf of causal knowledge.}}
    \label{fig:enter-label}
\end{figure}
\noindent\textbf{Contributions.}
We establish several results leading to the formal definition of RCK.
\emph{First}, we introduce the category of SCMs, viz. \SCMcat, whose objects are SCMs -- expressed as functors -- and whose morphisms are natural transformations.
We further characterize hard \cite{pearl2009causality} and soft \cite{eberhardt2007interventions} interventions in the latter category, proving that the set of entailed observational and interventional probability measures is closed under a convex combination operation (cf. \Cref{th:convex_comb_prob_meas}).
\emph{Second}, aiming at IC, we recast the \alphaabs in \SCMcat, proving that the IC CA morphism corresponding to endogenous variables is valid in the category of convex spaces of probability measures, viz. \CSprob (cf. \Cref{th:ca_affine_functions}).
Hence, we establish the existence of a functor encoding non-intervened SCMs into objects in \CSprob, representing the SCMs' causal knowledge.
\emph{Third}, leveraging the encoding functor, we define the network sheaf and cosheaf of causal knowledge, finally arriving at the formal definition of RCK.\\
\noindent\textbf{Impact.}
Our proposed network sheaves and cosheaves are particular instances of more general network sheaves and cosheaves in \CSprob. 
Our work ultimately emphasizes their relevance to applications in (but not limited to) AI/ML. 
In particular, RCK and the network sheaf and cosheaf of causal knowledge represent new objects to be investigated, adding to the cellular sheaves valued in Abelian categories, such as those of vector spaces and Hilbert spaces \cite{curry2014sheaves,hansen2019toward}, and to the non-Abelian case of sheaves of lattices \cite{ghrist2022cellular}. 
Overall, our work opens three major areas of research: \emph{cohomology theory}, \emph{Hodge(-like) theory}, and \emph{learning theory} for network sheaves and cosheaves of causal knowledge. \\
\noindent\textbf{Notation.}
Sets and collections are uppercase calligraphic, $\mathcal{A}$. 
The set of integers from $1$ to $n$ is $[n]$.
Given a scalar $a$, we denote $\bar{a} \coloneqq 1-a$.
A set \myexogenousvals equipped with a $\sigma$-algebra $\Upsilon$ gives a measurable space $(\myexogenousvals, \Upsilon)$.
A measurable space $(\myexogenousvals, \Upsilon)$ together with a probability measure $\mu$, i.e., such that $\mu(\myexogenousvals)=1$, gives a probability space $(\myexogenousvals, \Upsilon, \mu)$.
Given two measurable spaces $(\myexogenousvals, \Upsilon)$ and $(\myendogenousvals, \Omega)$, a measurable map $\varphi:\myexogenousvals \rightarrow \myendogenousvals$, and a probability measure $\mu$ over $(\myexogenousvals, \Upsilon)$, we denote by $\varphi(\mu)\coloneqq \mu \,\circ\, \varphi^{-1}$ the pushforward measure over $(\myendogenousvals, \Omega)$.
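Explicitly, $\varphi(\mu)(B)=\mu(\varphi^{-1}(B))$ for every $B \in \Omega$. As a toy illustration (the specific spaces and map below are chosen only for this example), let $\myexogenousvals=\{0,1\}^2$ and $\myendogenousvals=\{0,1,2\}$, both equipped with their power sets, let $\mu$ be the uniform measure on $\myexogenousvals$, and let $\varphi(u_1,u_2)=u_1+u_2$. Then
\[
\varphi(\mu)(\{0\})=\mu(\{(0,0)\})=\tfrac{1}{4}, \quad \varphi(\mu)(\{1\})=\mu(\{(0,1),(1,0)\})=\tfrac{1}{2}, \quad \varphi(\mu)(\{2\})=\mu(\{(1,1)\})=\tfrac{1}{4}.
\]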
The domain of a function is $\mathcal{D}[\cdot]$. 
The indicator function is $\indicatorf{\mathcal{A}}{A}$, which equals $1$ if $A \in \mathcal{A}$ and $0$ otherwise.
\begin{remark}
A less technical but comprehensive description, together with practical examples, can be found in Appendix~\ref{app:disc&ex}. This description is designed to be useful either before or after reading the main body of the paper.
\end{remark}