\documentclass{article}

\usepackage{aistats2024_author_response}

\usepackage[utf8]{inputenc} % allow utf-8 input
\usepackage[T1]{fontenc}    % use 8-bit T1 fonts
\usepackage{hyperref}       % hyperlinks
\usepackage{url}            % simple URL typesetting
\usepackage{booktabs}       % professional-quality tables
\usepackage{amsfonts}       % blackboard math symbols
\usepackage{nicefrac}       % compact symbols for 1/2, etc.
\usepackage{microtype}      % microtypography
\usepackage{xcolor}         % define colors in text
\usepackage{xspace}         % fix spacing around commands


\usepackage{amssymb}
\usepackage{mathtools}
\usepackage{amsthm}

\newcommand{\jin}[1]{\textcolor{blue}{#1}}
\newcommand{\yuta}[1]{\textcolor{red}{#1}}
\newcommand{\rev}[1]{\textcolor{blue}{#1}}

\begin{document}




Thank you for your constructive comments. %and suggestions. 
%They are very helpful for us to improve our paper. 
%In the following, your main questions are first stated and then followed by our answers.
Comments not addressed here will be incorporated in the revised paper.\\
{\bf [Reviewer \#1]}\\ 
\textbf{Q.} %There is no motivating example about the benefit of weakening this assumption%, and hence the strength is not sufficiently explained. Given that there are so many errors and typos, I believe this paper requires a major revision. Hence I cannot recommend the acceptance.
%\rev
\emph{Why do we need to weaken the existing separability assumption?} 
\textbf{A.} Weaker assumptions enable computing causal effects in scenarios where existing methods are not applicable, as discussed in the Remark following Theorem 3.1. The experiments clearly demonstrate the superiority of our methods over existing ones when the existing separability assumption is not met in the underlying models.\\
%"\emph{Separability}" is not testable, and Chernozhukov et al. (2007)* say that "{\it nonseparability of a structural disturbance is a key feature of many economic models}". Thus, "\emph{separability}" is often implausible. Hence, weaker assumptions have been desirable for IV methods. We will cite this paper as a motivation. In addition, our application in a real-world dataset in the paper is one of the motivating examples. We will fix errors and typos in the paper.\\
%\ * "Chernozhukov, Victor, Guido W. Imbens, and Whitney K. Newey. Instrumental variable estimation of nonseparable models. Journal of Econometrics 139.1 (2007): 4-14."
\textbf{Q.} \emph{There is no definition of the functions $f_{Y_1}$ and $f_{Y_2}$.} %Given that potential outcome is denoted by $Y_x$, it seems that the authors consider binary treatment $X=1$ and $X=2$ and that they express the structural functions of the potential outcomes when $X=1$ and $X=2$.\\
\textbf{A.} The functions $f_{Y_1}$ and $f_{Y_2}$ in Eq. (2) and Assumption 3.2 simply denote the two components of an additive decomposition of $f_Y$, i.e., $f_Y$ can be written as the sum of two functions $f_{Y_1}$ and $f_{Y_2}$.
We will rename $f_{Y_1}$ and $f_{Y_2}$ as $f_{Y}^1$ and $f_{Y}^2$ and add an explanation in the text to avoid confusion.\\
\textbf{Q.}
\emph{In some results in Table 1, the performance is worse than the existing method, PTSLS.}
\textbf{A.}
%It is not right.
%The result in Table 1 does not mean the performance is worse than the existing method, PTSLS. 
%It shows the estimated coefficients by PTSLS are biased; on the other hand, the estimated coefficients by P-CAPCE are not biased.
We assume you refer to the results for the coefficient corresponding to ``1''. One cannot draw such a conclusion. When $N=1000$, all the estimates have large variances (see Tables G.1 and G.2 in the appendix), so the differences in the estimated values are not statistically significant. The main observation is that the PTSLS estimate of the coefficient for $W$ is biased, while the P-CAPCE estimates converge to the true values as the sample size grows.\\
%\textbf{Q4.}  In Section 1, please clearly illustrate the motivation of this work, using several real-world examples (Synthetic experimental results might be helpful.) Why do we need to weaken the existing separability assumption? Without describing the motivation, readers cannot clearly understand the significance of this work.\\
%\textbf{A4.} 
%We will clearly illustrate the motivation in Section 1 as described in the answer A1.
%\textbf{Q5.}  In Section 1, “covariates such as gender, …” seems incorrect. That depends on the causal graph and the problem setup. For instance, in the community of fairness-aware machine learning, gender is often regarded as a treatment variable. Please do not mix the general setup and the example real-world scenarios.\\
%\textbf{A5.}
%We will delete "such as gender, ..." to avoid confusing.
%\textbf{Q6.}  In Section 2, “In practice, we normally do not know the details of the structural functions…” seems unclear. Can we consider abnormal cases? Synthetic data experiments?\\
%\textbf{A6.}
%In practice, we normally do not know the details of the structural functions, and we know the details of the structural functions in synthetic data experiments.
%\textbf{Q.}  In Section 3, “CAPCE is a generalization of CACE for continuous treatment” seems incorrect. We can consider CACE for continuous treatment values $X = x$, $x'$ as $E[Y_{x'} | {\boldsymbol w}] - E[Y_x |{\boldsymbol w}]$, right?\\
%\textbf{A.}
%Both CAPCE and $E[Y_{x’}| {\boldsymbol w}] - E[Y_x|{\boldsymbol w}]$ are the generalizations of CACE. And, $E[Y_{x’}| {\boldsymbol w}] - E[Y_x|{\boldsymbol w}]$ can be calculated by CAPCE.
%CAPCE is a generalization of CACE in the sense that CAPCE encodes the information $E[Y_{x’}| {\boldsymbol w}] - E[Y_x|{\boldsymbol w}]$ for any continuous treatment values $X = x, x'$ while CACE is normally defined for a binary treatment $X$.
\textbf{Q.}
\emph{In Assumpt. 3.1, the def. of each subject is not given. 
Are you referring to the observed data instance or to a set of all realizations of all variables, including exogenous variables?}
%In causal inference, we usually regard the latter as a subject, but here, it seems unclear. 
\textbf{A.}
We are referring to the latter, a realization of ${\boldsymbol U}={\boldsymbol u}$.\\ %of all variables, including exogenous variables.
%\textbf{Q8.}  As already pointed out, the functions in Assumptions 3.2 are never defined.
%\textbf{A8.} $f_Y$ is defined in Eq. (1), and we define $f_{Y_1}$ and $f_{Y_2}$ in Assumption 3.2. Assumption 3.2 mean $f_Y$ can be decomposed by some two functions $f_{Y_1}$ and $f_{Y_2}$.
\textbf{Q.}  \emph{“Interaction” also seems unclear. Do you simply mean the product terms in structural equations, or some other formal concept?} 
\textbf{A.} We refer to the product terms in structural equations.\\
\textbf{Q.}  \emph{In Theorem 3.1, it seems that the CDFs in function $k$ are not denoted by $\mathbb{P}$. Is it an error?}
\textbf{A.} No, it is not an error.
We use the symbol $\mathfrak{p}$ since it is a PDF (notation defined in the first paragraph of Section 2).\\
\textbf{Q.} \emph{In Section 4, the consistency theorem in Theorem 4.5 requires to set regularization parameter $\lambda_3=0$. ... %Is it OK in practice? 
This seems to lead to numerical instability when inverting Gram matrices. Am I correct? }
\textbf{A.} Yes, you are correct.
%Normally, regularization provokes some bias instead of stability of estimation. \jin{What do `` provokes some bias instead of stability of estimation'' mean? }
%\yuta{Regularization adds a penalty term to the risk function, which decreases variance but increases bias.}
%\jin
The theorem holds only for $\lambda_3=0$ because regularization introduces bias. In practice, however, the choice of $\lambda_3$ must account for the bias-variance trade-off.\\
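As a generic illustration of this trade-off (a sketch in notation introduced here, not the exact estimator of Section 4): let $K$ be a hypothetical symmetric positive semidefinite Gram matrix with largest and smallest eigenvalues $\sigma_{\max}\geq\sigma_{\min}\geq 0$, and let $\lambda>0$ be a ridge parameter. Then the condition number of the matrix to be inverted satisfies
\[
\kappa(K+\lambda I)=\frac{\sigma_{\max}+\lambda}{\sigma_{\min}+\lambda}\leq 1+\frac{\sigma_{\max}}{\lambda},
\]
so the inversion is well conditioned for moderate $\lambda$, while the bound diverges as $\lambda\to 0$ whenever $\sigma_{\min}$ is close to zero.\\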
\textbf{Q.}  \emph{In Section 4, the authors develop three families of estimators of CAPCE. ... but how can we choose one in practice? }
\textbf{A.} 
{The merits of sieve, parametric, and RKHS estimators are well understood in the literature. Roughly, parametric methods perform well only when the parametric model is correctly specified, and RKHS methods are computationally expensive.}\\
%We developed three families of estimators of CAPCE, P-CAPCE, S-CAPCE, and RKHS CAPCE. We can not choose the one among them in practice since it is case-by-case and untestable. We are only able to compare the results using those methods.
{\bf [Reviewer \#3]}\\ 
%\textbf{Q1.} Some of the notation in the introduction is not defined until much later in the paper or at all (e.g. $E[Y_x|{\boldsymbol w}]$, $Y_1, Y_2$ underscores in separability definition, $z_0$, etc.)\\
%\textbf{A1.} We will give the definition $\mathbb{E}[Y_x|{\boldsymbol w}]=\mathbb{E}_{\boldsymbol U}[Y_x({\boldsymbol U})|{\boldsymbol W}={\boldsymbol w}]$ like Definition 1 in the revised paper. We define $f_{Y_1}$ and $f_{Y_2}$ in Assumption 3.2, and $z_0$ in Theorem 3.1, respectively. Assumption 3.2 mean $f_Y$ can be decomposed additively by some two functions $f_{Y_1}$ and $f_{Y_2}$. We will rewrite $f_{Y_1}$ and $f_{Y_2}$ to $f_{Y}^1$ and $f_{Y}^2$ to avoid misunderstanding. And, $z_0$ is an arbitrary value of $\Omega_Z$.
\textbf{Q.} 
\emph{Based on the diagram in Figure 1, ... %it appears that there is no selection into the instrument, i.e. 
the instrument cannot be affected by observed covariates. 
Why is that and how would the analysis change if the instrument was only randomized conditional on ${\boldsymbol W}$?}
\textbf{A.} Great question. If there is an edge ${\boldsymbol W}\rightarrow Z$, Theorem 3.1 does not hold as stated. However, a similar identification result can be derived that takes $\mathbb{P}(Y|Z, {\boldsymbol W})$ as input instead of $\mathbb{P}(Y|Z)$. We will add an appendix discussing this setting.\\
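As a tentative sketch of the form such a result could take, assuming SCM ${\cal M}_{IV}$ and Assumptions 3.1 and 3.2 still hold and that every quantity in Theorem 3.1 is additionally conditioned on ${\boldsymbol W}={\boldsymbol w}$ (the symbols $\mu(z,{\boldsymbol w})$ and $k(z,x,{\boldsymbol w})$ are introduced here for illustration only):
\begin{align*}
\mu(z,{\boldsymbol w}) &= \int_{\Omega_X} k(z,x,{\boldsymbol w})\,\mathbb{E}[\partial_x Y_{x}|{\boldsymbol w}]\,dx,\\
\mu(z,{\boldsymbol w}) &= \mathbb{E}[Y|Z=z_0,{\boldsymbol W}={\boldsymbol w}]-\mathbb{E}[Y|Z=z,{\boldsymbol W}={\boldsymbol w}],\\
k(z,x,{\boldsymbol w}) &= \mathfrak{p}(X\leq x|Z=z,{\boldsymbol W}={\boldsymbol w})-\mathfrak{p}(X\leq x|Z=z_0,{\boldsymbol W}={\boldsymbol w}),
\end{align*}
where $z_0$ is a fixed value in $\Omega_Z$. Under this form, the CAPCE would have to be solved for as a function of $x$ separately at each ${\boldsymbol w}$, rather than as a single function of $(x,{\boldsymbol w})$.\\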
%Our assumptions are satisfied if the instrument was only randomized conditional on $W$.
%This paper follows the previous study [27] (Figure 1).
%\jin{Are you saying the results in the paper remain the same given an edge from W to Z? But I previously asked this question and you replied the results will not be the same. See the following commented out text in Notation.tex: } \jin{Do the results hold with Z = f(W, ) or an edge from W to Z in Figure 1?} \yuta{[If there is an edge from W to Z in Figure 1, the distribution $P(X,W|Z)$, say $P(X|Z)$, is biased.]}
%\yuta{Our assumptions are not satisfied even if the instrument was only randomized conditional on $W$. When we change $Z\sim U[0,1]$ to $Z\sim W+U[0,1]$ in the setting (A) ($N=10000$), the estimated coefficients by P-CAPCE are 43.367, 33.229, and 50.916, respectively for $1$, $W$, and $X$. These are largely biased.}
%\jin{Yuta, I don't think this answers the question. If there is edge $W\rightarrow Z$, theoretically, why will the identifiability and estimation results in the paper not hold? How would the analysis change?}
%\yuta{Our assumptions are not satisfies if there is an edge $W\rightarrow Z$ in Figure 1 because the regression $Y$ on $Z$, $\mu(z)=\mathbb{E}[Y|Z=z_0]-\mathbb{E}[Y|Z=z]$ in Theorem 3.1, violate exogeneity.} \jin{Not sure what "biased'' means. Can a result similiar to Theorem 3.1 be derived? That is, mayby you just need to update some expressions in Theorem 3.1?}
%\yuta{If there is an edge $W\rightarrow Z$ in Figure 1 and SCM ${\cal M}_{IV}$, we can use the following theorem:\\
%{\bf Theorem 3.1'.}
%{\it Under SCM ${\cal M}_{IV}$ and Assumptions 3.1 and 3.2, CAPCE $\mathbb{E}[\partial_x Y_{x}|{\boldsymbol w}]$ is identifiable from distributions $\mathbb{P}(X|Z,{\boldsymbol W})$ and $\mathbb{P}(Y|Z,{\boldsymbol W})$ via the integral equation:
%\begin{eqnarray}
%\mu(z,{\boldsymbol w})=\int_{\Omega_X} k(z,x,{\boldsymbol w})\mathbb{E}[\partial_x Y_{x}|{\boldsymbol w}] dx,
%\end{eqnarray}
%where  $\mu(z,{\boldsymbol w})=\mathbb{E}[Y|Z=z_0,{\boldsymbol W}={\boldsymbol w}]-\mathbb{E}[Y|Z=z,{\boldsymbol W}={\boldsymbol w}], k(z,x,{\boldsymbol w})=\mathfrak{p}(X\leq x|Z=z,{\boldsymbol W}={\boldsymbol w})-\mathfrak{p}(X\leq x|Z=z_0,{\boldsymbol W}={\boldsymbol w})$, and $z_0$ is a fixed value.}
%}
%\jin{1. 3.1' assumes the same separability as 3.1? 2. Given we could use 3.1's even if there is no edge from W to Z, what are the disadvantages of using 3.1' to estimate CAPCE? Otherwise, why not always use 3.1's instead of 3.1 regardless whether there is the edge W to Z?}
%\yuta{Theorem 3.1' assumes the separabilty as Assumption 3.1 and 3.2. The disadvantage of Theorem 3.1' is that we have to learn $\mathbb{E}[\partial_x Y_{x}|{\boldsymbol w}]$ by functions on $x$ for each ${\boldsymbol w} \in \Omega_{\boldsymbol W}$ respectively. The advantage of Theorem 3.1 is that we can learn $\mathbb{E}[\partial_x Y_{x}|{\boldsymbol w}]$ by one function on $x$ and ${\boldsymbol w}$.}
%\yuta{By {\bf Theorem 3.1'}, we can not calculate the $X$-${\boldsymbol W}$ plane as Figure G.1 and G.2 in Appendix. We can calculate only the curve of $\mathbb{E}[\partial_x Y_{x}|{\boldsymbol w}]$ given the values ${\boldsymbol W}={\boldsymbol w}$ as Figure 2.}
\textbf{Q.} \emph{Assumption 3.1 (4) is not usual. Can you expand more on that?} 
\textbf{A.} Assumption 3.1 (4) is needed for identifiability and has appeared in many papers such as [41] and [53]. Many commonly used models such as exponential families and location-scale families satisfy this assumption. See (Andrews 2017, Journal of Econometrics) for more discussion.\\
%Assumption 3.1 (4) has appeared in many papers such as [41], [53], and\\ "Miao, W., Geng, Z. and Tchetgen Tchetgen, E. J., 2018. Identifying causal effects with proxy variables of an unmeasured confounder. Biometrika, 105(4), pp.987-993."\\Miao et al. explain the completeness condition well in page 4.
%\jin{Ideally, we'd want to provide a compact explanation here instead of referring to other papers. Can you provide Miao's explanation here to see whether we can provide a brief summary explanation here? }
%\yuta{Assumption 3.1 (4) is needed for the uniqueness of the integral equation (3).
%Assumption 3.1 (4) has appeared in many papers such as [41] and [53]. This assumption is related to the full rank condition of parametric linear IV models, and many commonly used models such as exponential families and location-scale families satisfy this assumption.}
\textbf{Q.} 
\emph{In the proof of Theorem 3.1, lines (8)-(9) include an additional integration by $w$ which is incorrect.}
\textbf{A.}
We will fix it.\\
\textbf{Q.} 
\emph{Theorem 4.1 and 4.2: what are the guarantees for arbitrary regressors? 
I.e. can you give guarantees w.r.t. the guarantees of the $\hat{\mathbb{E}}$ estimators?} 
\textbf{A.} 
%\yuta{We do not discuss the consistency of arbitrary regressors. We make assumptions following previous studies NTSLS and Kernel IV. The consistency of arbitrary regressors will be the future work.}
%\jin{Yuta, I'm not sure I understand the question. So, Theorem 4.1 and 4.2 only hold for the regressors using the power series basis functions $q_i(z)$? Then answer as follows: Unfortunately, we do not have results for arbitrary regressors.}
%\yuta{[Comment] Yes, Theorem 4.1 and 4.2 only hold for the regressors using the power series basis functions $q_i(z)$.}
Unfortunately, we do not have results for arbitrary regressors.\\
\textbf{Q.} \emph{Do you have any evaluations with more than one heterogeneity feature?}
\textbf{A.} We will do this evaluation in the revision.\\
%Due to the limited time, we can not provide the experiments with more than one heterogeneity feature now.
%\jin{Do we plan to experiment with more than one heterogeneity feature? Can we be positive and promise to do such experiments? }
%We plan to experiment with more than one heterogeneity feature in the revised paper.
\textbf{Q.} \emph{Can you provide a comparison of the computational costs to those of the alternatives?}
\textbf{A.}
The costs are similar to those of the corresponding alternatives. We will add running times to Tables 1 and 2 in the revision. E.g., for setting (A) with $N=1000$, the running times in seconds are: PTSLS 0.131, NTSLS 0.342, Kernel IV 6.121, P-CAPCE 0.146, S-CAPCE 0.462, RKHS CAPCE 6.502.\\
%Due to the limited time, we can not provide the experiments to compare the computational costs. 
%\yuta{We compare the computational times under the setting (A), $1$ iteration, and $N=100$. The computational times are 0.131, 0.342, 6.121, 0.146, 0.462, and 6.502 seconds, respectively, for PTSLS, NTSLS, Kernel IV, P-CAPCE, S-CAPCE, and RKHS CAPCE.}
%Our methods take slightly more computational cost than those of the alternatives.
%\textbf{Q7.} I recommend highlighting in the main text the appendix results that demonstrate comparable performance with existing methods in situations where strong separability isn't a concern. I think that would strengthen your argument as to why these methods are so powerful.\\
%\textbf{A7.} We will highlight in the main text the appendix results that demonstrate comparable performance with existing methods in situations where strong separability isn't a concern.
{\bf [Reviewer \#4]} \\ 
%\textbf{Q.}
%No code is released. Computing infrastructure is not mentioned. Please release the reproducibility checklist.\\
\textbf{A.}
%We provide plenty of information to reproduce the experiments in the body of our paper and Appendix. \jin{Can you promise to release the code?}
We will release the code upon publication. The experiments were run on an Apple M1 machine (16GB). 
The reproducibility checklist is attached.\\
%\textbf{Q2.} Tables 1 and 2: bold best results for readability.\\
%\textbf{A2.} We will make the best results bold.
{\bf [Reviewer \#12]}\\ 
\textbf{Q.} 
%The results are significant compared to the baselines. I think the experimental section is well designed to show the difference in assumptions, and I appreciate including a real-world example, but 
\emph{I would have also liked to see a comparison to any method that directly estimates partial causal effects, for example, based on the work in [22].} 
\textbf{A.}
[22] uses a non-IV method, i.e., the assumptions employed in [22] differ from those of the IV setting. In fact, we are not aware of any work that directly estimates partial causal effects under the IV setting. \\
%Due to the limited time, we can not provide the comparison with [22]. 
%\yuta{They are focused on the propensity score-based methods. In addition, [46] and Hansen et al. (2014) ** directly estimate partial causal effects by kernel method and sieve regression, respectively.} However, our methods would be better than non-IV methods in the IV setting.
%\jin{Yuta, so [22] do not assume IV setting? [46] and Hansen et al. (2014) do not assume IV setting either? Can we claim "we are not aware of work that directly estimates partial causal effects under the IV setting"? }
%\yuta{[Comment] Yes, both [22], [46] and Hansen et al. (2014) do not assume the IV setting, and we can claim "we are not aware of work that directly estimates partial causal effects under the IV setting."}
%\yuta{** "Hansen, Bruce E., Nonparametric Sieve Regression: Least Squares, Averaging Least Squares, and Cross-Validation, The Oxford Handbook of Applied Nonparametric and Semiparametric Econometrics and Statistics (2014)."}
\textbf{Q.} \emph{For the real-world dataset, how many samples are left after excluding the missing values?}
\textbf{A.} 857 samples are left. \\
%For the real-world dataset, 857 samples are left after excluding the missing values.
\textbf{Q.} \emph{I think the model selection aspect is a bit glossed over. ... 
%The authors mention that standard methods are used for model selection, but how much does that affect the problem? 
What about a sensitivity analysis relative to model selection? }
\textbf{A.} We kept the discussion of model selection brief because it is well studied in ML. Model selection does affect the estimation results, and we will provide a sensitivity analysis in the revision. \\
%Model selection influences the estimation problem much. However, we do not explain the model selection much since it is well-studied in the machine learning area. We can consider a sensitivity analysis relative to model selection but do not discuss it in the paper due to the limited pages.
\textbf{Q.} \emph{When it comes to the IV assumptions, why is relevance highlighted in particular in Assumption 3.1? 
Are the other two assumptions not essential for this work? 
Can they be relaxed for this approach?} 
\textbf{A.} The exclusion restriction and exchangeability assumptions are still essential for this work, but they are implied by the SCM ${\cal M}_{IV}$ setting. 
%We think you refer to three IV assumptions: the relevance assumption, the exclusion restriction, and the exchangeability assumption. We do not highlight the other two assumptions besides the relevance assumption in Assumption 3.1 because they are implied from the SCM setting. They are still essential in our work.
%\textbf{Q4.} Minor comments: ...\\
%\textbf{A4.} We will fix those.



\section*{Checklist}




 \begin{enumerate}


 \item For all models and algorithms presented, check if you include:
 \begin{enumerate}
   \item A clear description of the mathematical setting, assumptions, algorithm, and/or model. [Yes]
   \item An analysis of the properties and complexity (time, space, sample size) of any algorithm. [Yes]
   %\item (Optional) Anonymized source code, with specification of all dependencies, including external libraries. [Yes/No/Not Applicable]
 \end{enumerate}


 \item For any theoretical claim, check if you include:
 \begin{enumerate}
   \item Statements of the full set of assumptions of all theoretical results. [Yes]
   \item Complete proofs of all theoretical results. [Yes]
   \item Clear explanations of any assumptions. [Yes]     
 \end{enumerate}


 \item For all figures and tables that present empirical results, check if you include:
 \begin{enumerate}
   \item The code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL). [No]. We will release the code upon publication.
   \item All the training details (e.g., data splits, hyperparameters, how they were chosen). [Yes]
    \item A clear definition of the specific measure or statistics and error bars (e.g., with respect to the random seed after running experiments multiple times). [Yes]
    \item A description of the computing infrastructure used. (e.g., type of GPUs, internal cluster, or cloud provider). [Yes]. We will add the information that the experiments were performed using Apple M1 (16GB).
 \end{enumerate}

 \item If you are using existing assets (e.g., code, data, models) or curating/releasing new assets, check if you include:
 \begin{enumerate}
   \item Citations of the creator If your work uses existing assets. [Yes]. We used an open dataset in the R package ``wooldridge'' (\url{https://cran.r-project.org/package=wooldridge}) as mentioned in Section 6.
   \item The license information of the assets, if applicable. [Not Applicable]. We use an open dataset.
   \item New assets either in the supplemental material or as a URL, if applicable. [Not Applicable]
   \item Information about consent from data providers/curators. [Not Applicable]. We use an open dataset.
   \item Discussion of sensible content if applicable, e.g., personally identifiable information or offensive content. [Not Applicable]. 
 \end{enumerate}

 \item If you used crowdsourcing or conducted research with human subjects, check if you include:
 \begin{enumerate}
   \item The full text of instructions given to participants and screenshots. [Not Applicable]
   \item Descriptions of potential participant risks, with links to Institutional Review Board (IRB) approvals if applicable. [Not Applicable]
   \item The estimated hourly wage paid to participants and the total amount spent on participant compensation. [Not Applicable]
 \end{enumerate}

 \end{enumerate}


\end{document}
