\documentclass{uai2025} % for initial submission
%\documentclass[accepted]{uai2025} % after acceptance, for a revised version; 
% also before submission to see how the non-anonymous paper would look like 
       

\newcommand{\removed}[1]{}
\usepackage{times}
\usepackage{soul}
\usepackage{url}
%\usepackage{hyperref}
\usepackage[utf8]{inputenc}
%\usepackage[small]{caption}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{amsthm}
\usepackage{booktabs}
\usepackage{algorithm}
\usepackage{algorithmic}
%\usepackage[switch]{lineno}
\usepackage{stackengine}
\def\defeq{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}}




%\usepackage{hyperref}
\usepackage{url}

\usepackage{amsmath}
%\usepackage{wrapfig,lipsum,booktabs}

\usepackage{amssymb}
\usepackage{mathtools}
\usepackage{amsthm}



\usepackage{lscape}
% if you use cleveref..
\usepackage[capitalize,noabbrev]{cleveref}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% THEOREMS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\theoremstyle{plain}

% Todonotes is useful during development; simply uncomment the next line
%    and comment out the line below the next line to turn off comments
%\usepackage[disable,textsize=tiny]{todonotes}
\usepackage[textsize=tiny]{todonotes}
\usepackage{multirow}

\usepackage{ascmac}
%\usepackage{fancybx}
\usepackage{float}
\usepackage{perpage}
\MakeSorted{figure}
\MakeSorted{table}

\usepackage{url}
\usepackage{natbib}
\usepackage{chapterbib}

\usepackage{color}
\usepackage{tikz}
\tikzset{%
mynode/.style={circle,minimum width=.5ex, fill=none,draw}, % no filling
myfillnode/.style={circle,minimum width=.5ex, fill=lightgray,draw}, % fill with light gray
}

\newcommand{\0}{$\mathrm{I}$}
\newcommand{\2}{$\mathrm{I}\hspace{-1.2pt}\mathrm{I}$}
\newcommand{\3}{$\mathrm{I}\hspace{-1.2pt}\mathrm{I}\hspace{-1.2pt}\mathrm{I}$}
\newcommand{\4}{$\mathrm{I}\hspace{-1.2pt}\mathrm{V}$}
%\newcommand{\3}{$\mathrm{i}$}
%\newcommand{\4}{$\mathrm{i}\hspace{-0.8pt}\mathrm{i}$}
%\newcommand{\5}{$\mathrm{i}\hspace{-0.8pt}\mathrm{i}\hspace{-0.8pt}\mathrm{i}$}
\newcommand{\6}{$\mathrm{i}\hspace{-0.8pt}\mathrm{v}$}
\newcommand{\indep}{\perp \!\!\! \perp}
%\usepackage[dvipdfmx]{graphicx}
%\bibliographystyle{unsrtnat}
%\DeclareMathOperator*{\argmin}{arg\,min}
%\DeclareMathOperator*{\argmax}{arg\,max}
% The \icmltitle you define below is probably too long as a header.
% Therefore, a short form for the running title is supplied here:
\usepackage{amsmath,amsthm}
\newtheorem{theorem}{Theorem}
\newtheorem{definition}{Definition}
\newtheorem{assumption}{Assumption}
\newtheorem{lemma}{Lemma}
\newtheorem{proposition}{Proposition}
\newtheorem{corollary}{Corollary}
\usepackage{multirow}
\usepackage{comment}
\usepackage{here}
\allowdisplaybreaks[4]
%\usepackage{bbm}
\usepackage{caption}
\usepackage{bbding}
\usepackage{arydshln}
\usepackage{afterpage}

%\usepackage{algpseudocode}
\usepackage{mathrsfs}
\DeclareMathOperator*{\plim}{p-lim}

\newcommand{\jin}[1]{\textcolor{blue}{[[#1]]}}
\newcommand{\jina}[1]{\textcolor{blue}{#1}}
\newcommand{\yuta}[1]{\textcolor{red}{#1}}
\newcommand{\error}[1]{\textcolor{green}{#1}}
\usepackage{soul}


                 
%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2024} % ptmx math instead of Computer
                                         % Modern (has noticeable issues)
% \documentclass[mathfont=newtx]{uai2024} % newtx fonts (improves upon
                                          % ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent
\usepackage[american]{babel}
% \usepackage[british]{babel}

%% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
\usepackage{booktabs} % commands to create good-looking tables
\usepackage{tikz} % nice language for creating drawings and diagrams

%% Provided macros
% \smaller: Because the class footnote size is essentially LaTeX's \small,
%           redefining \footnotesize, we provide the original \footnotesize
%           using this macro.
%           (Use only sparingly, e.g., in drawings, as it is quite small.)

%% Self-defined macros
\newcommand{\swap}[3][-]{#3#1#2} % just an example

\title{Moments of Causal Effects}

% The standard author block has changed for UAI 2024 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is automatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
\author[1]{\href{mailto:<jj@example.edu>?Subject=Your UAI 2024 paper}{Jane~J.~von~O'L\'opez}{}}
\author[1]{Harry~Q.~Bovik}
\author[1,2]{Further~Coauthor}
\author[3]{Further~Coauthor}
\author[1]{Further~Coauthor}
\author[3]{Further~Coauthor}
\author[3,1]{Further~Coauthor}
% Add affiliations after the authors
\affil[1]{%
    Computer Science Dept.\\
    Cranberry University\\
    Pittsburgh, Pennsylvania, USA
}
\affil[2]{%
    Second Affiliation\\
    Address\\
    …
}
\affil[3]{%
    Another Affiliation\\
    Address\\
    …
  }
  
\begin{document}



Thank you for your valuable feedback. We hope that the responses below address your concerns and lead to a positive reassessment of our paper.

>Comment:
The motivation of the results are somehow unclear.

Our response: 
Our main motivation is to uncover the shape of the distribution of causal effects through its moments, whereas traditional causal effect assessment focuses on the average causal effect (ACE).
We will add the following passage to the introduction to clarify why understanding the shape of the distribution of causal effects is important:

``The shape of the distribution of causal effects reveals causal effect heterogeneity, an actively researched topic in statistics, causal inference, and machine learning.
Causal effect heterogeneity refers to the variation in causal effects across individuals or subgroups within a population.
Existing work on causal effect heterogeneity mainly examines the conditional average causal effect (CACE), $E[Y_1-Y_0 \mid W]$, given subjects' covariates $W$.
However, CACE captures only the heterogeneity across subpopulations defined by the observed covariates $W$, not the heterogeneity across individuals.
In contrast, the shape of the distribution of causal effects reveals the heterogeneity of causal effects across individuals and thus provides information complementary to CACE.''
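
One way to illustrate this point is the law of total variance. The following is a sketch only, writing $\tau = Y_1 - Y_0$ for the individual causal effect (a symbol introduced here for illustration, not the paper's notation) and assuming $\tau$ has finite variance:
\begin{equation*}
\operatorname{Var}(\tau) \;=\; \operatorname{Var}\bigl(E[\tau \mid W]\bigr) \;+\; E\bigl[\operatorname{Var}(\tau \mid W)\bigr].
\end{equation*}
The first term is the variation of CACE across subpopulations defined by $W$. The second term is the variation across individuals within those subpopulations, which CACE by itself does not capture.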

In the last paragraph of Section 3.1, we discuss how higher-order moments of causal effects provide useful information about the distribution: ``Variance and standard deviation quantify the dispersion of a distribution.
If the variance of causal effects is large, the causal effects may deviate significantly from ACE for some subjects. Skewness is a measure of the asymmetry of a probability distribution. 
If the causal effect is positively skewed, the right tail of the distribution of the causal effect is longer.
If the causal effect is negatively skewed, the left tail of the distribution is longer.
Kurtosis is a measure of the tailedness or peakedness of a distribution.  High kurtosis values indicate the presence of outliers in causal effects.''
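
For reference, these quantities can be written out as follows. This is a sketch that assumes the usual standardized definitions of skewness and kurtosis and writes $\tau = Y_1 - Y_0$ for the individual causal effect (a symbol introduced here for illustration); the definitions in Section 3.1 of the paper take precedence if they differ:
\begin{align*}
\mu &= E[\tau], &
\sigma^2 &= E\bigl[(\tau-\mu)^2\bigr], \\
\text{skewness} &= \frac{E\bigl[(\tau-\mu)^3\bigr]}{\sigma^3}, &
\text{kurtosis} &= \frac{E\bigl[(\tau-\mu)^4\bigr]}{\sigma^4}.
\end{align*}
Under these definitions, a Gaussian distribution has skewness $0$ and kurtosis $3$, and an exponential distribution has skewness $2$.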

Please note that we also study conditional moments of causal effects in Appendix D (as mentioned in the Conclusion section). They characterize the shape of the distribution of causal effects within a subpopulation defined by subjects' covariates $W$.


>Comment:
The idea of the paper is interesting, from a theoretical point of view, but 
I am not convinced that this higher moments would render some insights into the causal effect in practical applications. 
My doubts are somehow confirmed by the comments on the real world example, which are not particularly enlightening of new insights into the behavior of the causal effect. 


Our response: 
Consider the following results on the real-world example in Section 6:

Mean: $3.432$ 

Variance: $3.072$ 

Standard deviation: $1.753$ 

Skewness: $21.027$ 

Kurtosis: $21.312$ 

Assuming the sample is large enough for these estimates to be reliable, the ACE is $3.432$. The relatively large standard deviation ($1.753$) suggests that the causal effect varies considerably across individuals.
The estimated skewness is substantially greater than that of an exponential distribution, which has a skewness of 2, and the estimated kurtosis is substantially higher than that of a Gaussian distribution, which has a kurtosis of 3.
The large positive skewness suggests that more individuals have effects below the average of $3.432$ than above it, and that a small number of individuals have effects far above the average, which pulls the average above the median.
The large kurtosis indicates heavy tails, that is, the presence of outliers; together with the large positive skewness, this suggests that the outliers are individuals whose causal effects are much higher than the average.
These results offer novel insights into the behavior of causal effects beyond the standard ACE, particularly regarding causal effect heterogeneity across individuals. We will revise the discussion of the results in Section 6 to explain their meaning more fully.

>Comment:
When and why this higher moments should be computed?

Our response: 
The ACE reflects the average of the causal effects over the entire population. Higher moments of causal effects should be computed when researchers aim to understand the distribution or heterogeneity of causal effects, that is, how causal effects differ across individuals. Higher moments can reveal asymmetries in the distribution, which may suggest the presence of distinct response groups that require different treatment strategies. This understanding is crucial for tailoring interventions and designing more effective policies. Our work provides new tools for achieving this goal.









%aim to understand the shape of the causal effect distribution and the causal effect heterogeneity.



%>Comment: \st{(Maybe this could be shown also with interventional data where I guess the restrictive assumption of monotonicty does not need to hold for the identifiability)}



%Our response: \st{Even with interventional data, the monotonicity assumption is necessary for identifiability. In our simulation, we assume interventional data. The exogeneity assumption naturally holds in the context of interventional data.}



\end{document}






%>Comment:The real world example and the simulation are extremely simple cases.

%Our response: \st{Our real-world example and simulation may appear simple; however, they are not "extremely" simple. Rather, they represent some of the most familiar scenarios for researchers and provide valuable and informative insights.} \jin{The response is very unconvincing. Perhaps we ignore this comment.}


%\jin{It looks like you revised my interpretation swiching larger with smaller. But I think my original writing is correct (you should not have deleted them)  and your interpretation is wrong. Please check carefully. Average > median means more number of units have values less than average, but there exists some units that have very large values, exactly opposite to your claim.}



%In Section 6, we stated: "Relatively large standard deviation suggests that the causal effects exhibit some degree of heterogeneity. The positive skewness suggests that the distribution of the causal effect may be positively skewed." These sentences offer novel insights into the behavior of causal effects, particularly regarding causal effect heterogeneity across individuals. Prior works primarily focus on comparing the averages of the potential outcomes, which correspond to the first moments of causal effects, and thus fail to capture detailed information about causal effect heterogeneity.


%\st{Our simulations and application illustrate scenarios in which causal effect heterogeneity exists to a certain extent. We will provide additional experiments in settings where the causal effects are homogeneous, heavily skewed, or exhibit heavy tails, to demonstrate that our results can effectively capture and reveal these phenomena.}