
Instrumental variable (IV) analysis is a powerful tool used to elucidate causal relationships between treatment ($X$) and outcome ($Y$)  when a controlled experiment is not feasible % or when a randomized experiment is not able to successfully treat each unit
\citep{Imbens2014,Angrist2001}. 
A large body of work focuses on binary or categorical treatment variables \citep{Imbens1994,Balke1997}; more recently, interest in continuous treatment variables has grown \citep{Imbens2004,Kennedy2017,Bahadori2022}.
%, which is the focus of this paper.
%To identify the structural function $\mathbb{E}[Y_x]$ of a continuous treatment using IV, \cite{Whitney2003} introduced an integral equation, which is widely used in the field of machine learning \citep{Hartford2017,Singh2019,Muandet2020}.
%\yuta{Recently, \citep{Wong2022} introduced another integral equation for identifying average partial causal effect (APCE) $\mathbb{E}[\partial_xY_x]$, and \citep{Kawakami2023} developed two estimation methods of APCE, named P-APCE and N-APCE estimator.}
There is also considerable recent interest in estimating  heterogeneous causal effects %of a binary treatment 
across subsets of the population \citep{Athey2016,Ding2016,Athey2019,Kunzel2019,Wager2018,Zhang2022,Singh2023},  %or continuous treatment \citep{Zhang2022} %\jin{Explain what "heterogeneous" causal effect means.} \yuta{Heterogeneity of causal effect is the nonrandom and explainable variability of causal effects   
%\citep{Varadhan2013}. 
including IV-based methods %is also helpful for estimating CACE 
\citep{Angrist2004,Syrgkanis2019,Klein2020,Bargagli2022}. 
Most of these works focus on the \emph{conditional average causal effect (CACE)} $\mathbb{E}[Y_1-Y_0|{\boldsymbol w}]$, also known as the conditional average treatment effect (CATE), for evaluating heterogeneous causal effects of a binary $X$, where $Y_x$ denotes the potential outcome under treatment $X=x$, and ${\boldsymbol W}$ denotes covariates (e.g., gender, age, and race).

In this work, we study the estimation of heterogeneous causal effects of a \emph{continuous} treatment via the IV method. Existing work in this direction has focused on estimating $\mathbb{E}[Y_{x}|{\boldsymbol w}]$.  %and relies on a separability assumption for identifying $\mathbb{E}[Y_{x}|{\boldsymbol w}]$ \citep{Whitney2003}.  
The most widely used methods include parametric two-stage least squares (PTSLS) \citep{Wright1928,Angrist2009,Wooldridge2010}, sieve nonparametric two-stage least squares (sieve NTSLS) \citep{Whitney2003,Chen2018}, and Kernel IV \citep{Singh2019}. The line of work in \citep{Syrgkanis2019,Dikkala2020,Muandet2020,Bennett2023} focuses on the efficiency of estimators assuming simple additive errors. % functions. 
All these methods rely on a \emph{separability} assumption for identifying $\mathbb{E}[Y_{x}|{\boldsymbol w}]$ \citep{Whitney2003}. 

Another quantity for evaluating the causal effects of a continuous treatment is the average partial causal effect (APCE) $\mathbb{E}[\partial_xY_x]$ \citep{Chamberlain1984,Wooldridge2005,Graham2012}. \citet{Wong2022} provided a condition for identifying $\mathbb{E}[\partial_xY_x]$, and \citet{Kawakami2023} presented APCE estimators. 

In this paper, 
we consider $\mathbb{E}[\partial_x Y_{x}|{\boldsymbol w}]$, termed the \emph{conditional average partial causal effect (CAPCE)}, to capture the heterogeneous causal effects of a continuous treatment. CAPCE extends APCE and is a natural generalization of the CACE of a binary treatment. 
{The quantity represented by CAPCE has been implicitly studied in the literature \citep[e.g.,][]{Galagate2016}. % but is never formally defined to the best of our knowledge. %and it is not tied to the IV analysis. 
Still, existing works have focused on $\mathbb{E}[Y_x|{\boldsymbol w}]$. 
One contribution of this work is to show that, under the IV model, CAPCE is identifiable under a \emph{weaker} separability assumption than that required by previous work (sieve NTSLS, PTSLS, Kernel IV) for identifying $\mathbb{E}[Y_x|{\boldsymbol w}]$. Thus, computing CAPCE allows scientists to estimate causal effects in a larger class of models.
%We present theoretical and empirical results to show the usefulness of formally defining and investigating CAPCE and the merits of estimating CAPCE on behalf of $\mathbb{E}[Y_x|{\boldsymbol w}]$. 
Granted, given an estimated $\mathbb{E}[Y_x|{\boldsymbol w}]$, one can compute its derivative to obtain CAPCE, but not the other way around. However, in practice,  the causal effect from a reference point (e.g., CACE) is often the main interest, and CAPCE is enough to compute causal effects from a reference point: $\displaystyle\mathbb{E}[Y_{x''}-Y_{x'}|{\boldsymbol w}]=\int_{x'}^{x''} \mathbb{E}[\partial_x Y_x|{\boldsymbol w}]dx$.
} 
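To see why CAPCE suffices for effects from a reference point, a short sketch may help. Assuming that $Y_x$ is continuously differentiable in $x$ and that the conditional expectation and the integral can be interchanged (regularity conditions assumed here only for illustration), the fundamental theorem of calculus gives
\[
\mathbb{E}[Y_{x''}-Y_{x'}|{\boldsymbol w}]
=\mathbb{E}\Bigl[\int_{x'}^{x''}\partial_x Y_x\,dx\,\Big|\,{\boldsymbol w}\Bigr]
=\int_{x'}^{x''}\mathbb{E}[\partial_x Y_x|{\boldsymbol w}]\,dx .
\]
For instance, under the purely hypothetical form $\mathbb{E}[\partial_x Y_x|{\boldsymbol w}]=a({\boldsymbol w})+2b({\boldsymbol w})x$, where $a$ and $b$ are illustrative functions and not part of our model, this yields
\[
\mathbb{E}[Y_{x''}-Y_{x'}|{\boldsymbol w}]
=a({\boldsymbol w})\,(x''-x')+b({\boldsymbol w})\bigl((x'')^2-(x')^2\bigr),
\]
so the effect of moving the treatment from $x'$ to $x''$ is recovered without knowing the level $\mathbb{E}[Y_{x'}|{\boldsymbol w}]$ itself.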
%It enables us to evaluate the CACE of changing treatments from $x'$ to $x''$ by $\mathbb{E}[Y_{x''}-Y_{x'}|{\boldsymbol w}]=\int_{x'}^{x''}\mathbb{E}[\partial_x Y_{x}|{\boldsymbol w}]dx$. 
%Assume the causal relations are represented by structural equations $Y=f_Y(X,{\boldsymbol W},{\boldsymbol H},{\boldsymbol u}_Y)$, $X=f_X({\boldsymbol Z},{\boldsymbol W}, {\boldsymbol H},{\boldsymbol u}_X)$, ${\boldsymbol W}=f_{\boldsymbol W}({\boldsymbol H},{\boldsymbol u}_{\boldsymbol W})$ and denote the potential outcome $Y_{x,{\boldsymbol w}}=f_Y({\boldsymbol z},{\boldsymbol w}, {\boldsymbol H},{\boldsymbol u}_Y)$.
%\yuta{We note that it is possible to estimate CAPCE using parametric two-stage least squared estimation (PTSLS) \citep{Wright1928,Angrist2009,Wooldridge2010}, sieve nonparametric two-stage least squared (sieve NTSLS) estimation \citep{Whitney2003} and Kernel IV \citep{Singh2019}.}  
%However, they required the strict assumption, 
% that the separability;
%$f_Y(X,{\boldsymbol W},{\boldsymbol H},{\boldsymbol u}_Y)=f_Y^1(X,{\boldsymbol W},{\boldsymbol u}_Y)+f_Y^2({\boldsymbol H},{\boldsymbol u}_Y)$ and $\mathbb{E}_{{\boldsymbol H},{\boldsymbol u}_X,{\boldsymbol u}_Y}[{\boldsymbol H}|Z]=0$
%and their methods are indirect approachs for estimating CAPCE via estimating the structural function, $\mathbb{E}[Y_{x}|{\boldsymbol W}={\boldsymbol w}]$.
%We then present a method for identifying CAPCE via IV that extends the results in \cite{Wong2022} for identifying average partial causal effects  $\mathbb{E}[\partial_x Y_{x}]$. Our identification condition requires weaker  assumptions than the previous work (sieve NTSLS, PTSLS, Kernel IV) for identifying  $\mathbb{E}[Y_{x}|{\boldsymbol w}]$. 

We then develop three families of methods for estimating CAPCE: sieve, parametric, and reproducing 
kernel Hilbert space (RKHS)-based, and analyze their statistical properties. %, extending existing estimators for $\mathbb{E}[Y_x|{\boldsymbol w}]$ \citep{Whitney2003,Singh2019} and APCE \citep{Kawakami2023}. 
%\yuta{Our parametric estimator is the generalization of P-APCE estimator, and the other estimators are not in \citep{Kawakami2023}. Unfortunately, we can not apply N-APCE estimator for estimating CAPCE because the integral kernel of the integral equation for identifying CACPE is the conditional density function.}
Finally, we illustrate the proposed  estimators on synthetic data, showing superior performance to existing methods. %sieve NTSLS and PTSLS. 
We also evaluate CAPCE on a real-world dataset. % to elicit causal relation between years of education and wages, which is of  interest in economics. % \citep{Card1999,Angrist1991}.
%\jin{Draft. In summary, our main contributions are
%\begin{itemize}
%    \item We introduce a new CAPCE)  $\mathbb{E}[\partial_x Y_{x}|{\boldsymbol w}]$ to capture the heterogeneous causal effects of a continuous treatment, and present a novel identification method under IV model. Notably, CAPCE is identifiable under a weaker condition than commonly studied ..
%    \item We derive estimators and analyze .... Experimentally show superiority
%\end{itemize}}
%Recently, \citep{Wong2022} introduced another integral equation for identifying average partial causal effects (APCE) $\mathbb{E}[\partial_x Y_{x}]$. In this paper, we give an identification theorem of CAPCE directly as a development of his result. This theorem enables us to directly identify CAPCE under weaker assumptions than the previous work.
%, which is $f_Y(X,{\boldsymbol W},{\boldsymbol H},{\boldsymbol u}_Y)=f_Y^1(X,{\boldsymbol W},{\boldsymbol u}_Y)+f_Y^2({\boldsymbol W},{\boldsymbol H},{\boldsymbol u}_Y)$.
%Our assumption requires the separability only on $X$.
%Next, we give a sieve estimation method of CAPCE based on the orthogonal basis functions \citep{Whitney2003,Ai2003}, and a parametric estimation method.
%Since the integral equation is ill-posed \citep{Tikhonov1995}, we restrict the candidate functions of CAPCE to be in the compact set using the Sobolev norm called consistency norm \citep{Gallant1987}.
%Furthermore, we give the reproducing kernel Hilbert space (RKHS) estimator of CAPCE, which is introduced for a structural function by \citep{Singh2019}.
%Our identification condition and estimation method are superior to PTSLS and sieve NTSLS in the case of research interest is focused on only CAPCE.


%Finally, we analyze the statistical properties of sieve CAPCE estimator, parametric CAPCE estimator and RKHS CAPCE estimator. We illustrate them on synthetic data, comparing with sieve NTSLS and PTSLS, and a real-world dataset evaluating the causal relationship between years of education and wages, which is of great interest in economics \citep{Card1999,Angrist1991}. We especially reveal the heterogeneity of causal effects with the subject's IQ.
