

The \emph{moments} of random variables have been fundamental statistical measures since their introduction by Pafnuti Lvovich Chebyshev in the mid-nineteenth century \citep{Mackey1980}.
The $m$-th moment of a random variable $Y$ is defined as $\mathbb{E}[Y^m]$, and the $m$-th central moment is defined as $\mathbb{E}[(Y-\mathbb{E}[Y])^m]$.
These moments characterize the shape of a random variable’s probability distribution, encompassing measures such as mean, variance, skewness, and kurtosis \citep{Pearson1896,Joanes1998,Cramer1999,Doane2011,Hippel2005,Westfall2014}.
%The moments of random variables have been extensively studied in prior research \citep{Rosenblueth1975,Hong1998} and are commonly covered in modern statistical textbooks \citep{Casella2002,Shao2008,Hogg2013,Heumann2016}.
The (central) moments of random variables also play a fundamental role in various machine learning techniques \citep{Bishop2006,Hastie2009,Murphy2022}.




In contrast, the primary focus of causal inference is not a single random variable $Y$ but the causal effect $Y_1 - Y_0$, where $Y_x$ denotes the potential outcome under treatment $X = x$.
Traditionally, researchers assess causal effects by estimating the average causal effect (ACE), i.e., $\mathbb{E}[Y_1 - Y_0]$, which is the first moment of the causal effect \citep{Neyman1923, Rubin1978, Holland1986, Balke1997, Robins1999}.



Recently, there has been increasing interest in exploring aspects of causal effects beyond their average, particularly in the distributional properties of causal effects \citep{Ju2010, Wiedermann2022, Lin2023, Kennedy2023b, Post2023}.
%and the heterogeneity of causal effects \citep{Athey2016, Shalit2017, Athey2019, Kunzel2019, Wager2018, Singh2023, Kawakami2024b}, across the fields of statistics, machine learning, and causal inference.
{The shape of the distribution of causal effects reveals causal effect heterogeneity, an actively researched topic in the fields of statistics, causal inference, and machine learning \citep{Athey2016, Shalit2017, Athey2019, Kunzel2019, Wager2018, Singh2023, Kawakami2024b}.
Causal effect heterogeneity refers to the variation in causal effects across individuals or subgroups within a population.
Existing works on causal effect heterogeneity mainly examine the conditional average causal effects (CACE), i.e., $\mathbb{E}[Y_1-Y_0|W=w]$, based on subjects’ covariates $W$.
However, CACE captures only the heterogeneity across subpopulations specified by observed covariates $W$, not the heterogeneity across individuals.
In contrast, the shape of the distribution of causal effects reveals the heterogeneity of causal effects across individuals and provides complementary information to CACE.}



Our objective is to address %this gap by studying the moments of causal effects using a nonparametric approach.
the following causal question:
\begin{center}
(\textbf{Question 1}).
``{\it
How are causal effects distributed?
}''
\end{center}
We approach this question by studying the moments of causal effects $\mathbb{E}\Big[(Y_1-Y_0)^m\Big]$. These moments serve as measures that characterize the shape of the distribution of causal effects. Furthermore, we examine the central moments %of the causal effect $Y_1 - Y_0$, defined as
$\mathbb{E}\Big[\Big\{(Y_1-Y_0)-(\mathbb{E}[Y_1]-\mathbb{E}[Y_0])\Big\}^m\Big]$.
%\begin{equation}
%\mathbb{E}\Big[\Big\{(Y_1-Y_0)-(\mathbb{E}[Y_1]-\mathbb{E}[Y_0])\Big\}^m\Big].
%\end{equation}
The central moments quantify how causal effects deviate from the ACE.
They yield key statistical measures of the distribution of causal effects, such as its variance, standard deviation, skewness, and kurtosis.
While previous work has examined the second central moment (variance) of causal effects \citep{Heckman1997,Hernan2024}, we provide a general analysis covering moments of arbitrary order $m$.
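
As a sketch of how the central moments map onto these named measures, write $\mu_m$ for the $m$-th central moment of $Y_1-Y_0$ (a shorthand introduced only for this illustration). Under the standard moment-based definitions, and assuming $\mu_2>0$, the variance is $\mu_2$, the standard deviation is $\sqrt{\mu_2}$, and the skewness and kurtosis are the standardized third and fourth central moments,
\[
\frac{\mu_3}{\mu_2^{3/2}}
\qquad\mbox{and}\qquad
\frac{\mu_4}{\mu_2^{2}},
\]
respectively.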


% and in this paper, we study the (central) moments of causal effects in a general form.}
%The $m$-th moment of causal effects $Y_1-Y_0$ is defined as
%\begin{equation}
%\mathbb{E}\Big[(Y_1-Y_0)^m\Big].
%\end{equation}
%These moments serve as measures that characterize the shape of the distribution of causal effects. Moreover, the moments of causal effects can provide valuable insights for addressing 

Several studies \citep{DiNardo1996,Robins2001,Rubin2006,Jung2021,Kennedy2023d,Kim2024} aim to estimate the probability density function (PDF) of $Y_x$. 
However, identifying the moments of causal effects requires the joint distribution of $(Y_0, Y_1)$, which the marginal distributions of $Y_0$ and $Y_1$ do not determine.
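For instance, expanding the second moment gives
\[
\mathbb{E}\Big[(Y_1-Y_0)^2\Big]=\mathbb{E}[Y_1^2]-2\,\mathbb{E}[Y_1Y_0]+\mathbb{E}[Y_0^2],
\]
where the first and last terms depend only on the marginal distributions of $Y_1$ and $Y_0$, whereas the cross term $\mathbb{E}[Y_1Y_0]$ depends on their joint distribution. More generally, assuming the expectations involved exist, the binomial theorem gives
\[
\mathbb{E}\Big[(Y_1-Y_0)^m\Big]=\sum_{k=0}^{m}\frac{m!}{k!\,(m-k)!}\,(-1)^{m-k}\,\mathbb{E}\Big[Y_1^{k}\,Y_0^{m-k}\Big],
\]
in which every term with $0<k<m$ is a mixed moment of $(Y_0,Y_1)$.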
%Identifying the moments of causal effects is challenging because the joint distribution of $Y_1$ and $Y_0$ is never observed \citep{Holland1986, Hernan2024}. 
The joint distribution of potential outcomes has been explored in the framework of probabilities of causation (PoC) \citep{Pearl1999, Tian2000, ALi2024}. 
We identify the (central) moments of causal effects  by leveraging the recent identification results for variants of the PoC established by  \citet{Kawakami2024}. 
%Recently, \citet{Kawakami2024} established  identification results for several variants of the PoC.  % for a continuous outcome under the assumptions of exogeneity and monotonicity. Building on these assumptions, we demonstrate the identification of 
Additionally, we derive bounds for the (central) moments of causal effects under relaxed assumptions. %without relying on their monotonicity assumptions.

We further address the following causal question:
\begin{center}
(\textbf{Question 2}).
``{\it
How are two causal effects related?
}''
\end{center}
Researchers often consider more than two treatment options and compare multiple potential outcomes $\{Y_1,Y_2,\dots,Y_R\}$ %including cases with more than three,
\citep{Bartholomew1959, Page1963, Imbens2000, Imai2004}.
%We consider the comparison of the $R$ potential outcomes $\{Y_1,Y_2,\dots,Y_R\}$ with $\Omega_X=\{1,\dots,R\}$. The causal effect of $X=i$ and $X=j$ is given by $Y_i-Y_j$ and the causal effect of $X=h$ and $X=k$ is given by $Y_h-Y_k$.
%The concept of correlation plays a fundamental role in statistics, machine learning, and causal inference. % \citep{Galton1888, Yule1897, Fisher1915, Isserlis1918, Pearson1920, Wright1921, Wishart1928, Wright1934, Pitman1939, Wright1960, Stigler1989, Aldrich1995, Pearl09}.
We then investigate the \emph{product moments} of causal effects %, defined as
%\begin{equation}
$\mathbb{E}\Big[(Y_i-Y_j)(Y_k-Y_h)\Big]$,
%\end{equation}
as well as the central product moments (covariance and correlation) of causal effects,   %given by
%\begin{equation}
%\begin{aligned}
%&\mathbb{E}\Big[\Big\{(Y_i-Y_j)-(\mathbb{E}[Y_i]-\mathbb{E}[Y_j])\Big\}\\
%&\hspace{1cm}\times\Big\{(Y_k-Y_h)-(\mathbb{E}[Y_k]-\mathbb{E}[Y_h])\Big\}\Big],
%\end{aligned}
%\end{equation}
where $Y_i-Y_j$ represents the causal effect of changing $X=j$ to $X=i$, and $Y_k-Y_h$ represents the causal effect of changing $X=h$ to $X=k$. 
%\emph{Product moments} of random variables are another important class of moments, defined as the expectation of the product of two random variables. They are commonly used to define Pearson correlation coefficients \citep{Pearson1905, Rodgers1988}. 
The product moments of causal effects reveal the association between two causal effects.
When the central product moment (covariance) of $Y_i-Y_j$ and $Y_k-Y_h$ is positive, subjects with larger $Y_i-Y_j$ tend to have larger $Y_k-Y_h$.
When it is negative, subjects with larger $Y_i-Y_j$ tend to have smaller $Y_k-Y_h$.
When it is zero, there is no linear relationship between $Y_i-Y_j$ and $Y_k-Y_h$.
The product moments of causal effects thus provide information beyond the ACEs $\mathbb{E}[Y_i-Y_j]$ and $\mathbb{E}[Y_k-Y_h]$.
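
The relation between the raw and central product moments can be made explicit. Writing $\tau_{ij}=Y_i-Y_j$ and $\tau_{kh}=Y_k-Y_h$ (shorthand introduced only for this illustration) and assuming finite second moments, the covariance is
\[
\mathrm{Cov}(\tau_{ij},\tau_{kh})=\mathbb{E}[\tau_{ij}\,\tau_{kh}]-\mathbb{E}[\tau_{ij}]\,\mathbb{E}[\tau_{kh}],
\]
and, when both variances are positive, the correlation is
\[
\mathrm{Corr}(\tau_{ij},\tau_{kh})=\frac{\mathrm{Cov}(\tau_{ij},\tau_{kh})}{\sqrt{\mathrm{Var}(\tau_{ij})\,\mathrm{Var}(\tau_{kh})}}.
\]
The raw product moment therefore coincides with the covariance whenever at least one of the two ACEs is zero.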

%We establish an identification theorem for the (central) product moments of causal effects under the assumptions of exogeneity and monotonicity. Additionally, we derive bounds for the (central) product moments of causal effects without relying on the monotonicity assumption.
We establish identification theorems for the (central) product moments of causal effects and derive bounds for them under more relaxed assumptions. 
%Finally, we present estimation methods for the (central) product moments of causal effects, conduct experiments to illustrate their properties, 
Finally, we conduct experiments estimating the (product) moments of causal effects from finite samples and demonstrate their practical application using a real-world medical dataset. % \citep{Westfall2011}.

