%
\textbf{Statistical and Causal Models.}
We assume that the $\mathbb{R}^d$-valued stochastic process $\{x_t\}_{t \in \mathbb{Z}}$ follows a weakly stationary vector autoregressive model of order $p$ (VAR($p$)) for some $p, d \in \mathbb{N}$, which is defined as
 \begin{equation}
 \label{eq:VAR}
     x_t = A_1 x_{t-1} + A_2 x_{t-2} + \cdots + A_p x_{t-p} + \epsilon_t,
 \end{equation}
 where $x_t \in \R^d$ is the vector-valued observation at time $t$, $A_i \in \R^{d \times d}$ for $i \in [p]$ are the coefficient matrices of the VAR model, and $\epsilon_t \in \R^{d}$ denotes the noise vector, which satisfies $\E[\epsilon_t] = 0$ and $\E[\epsilon_t \epsilon_{t+h}^T] = \Sigma_{\epsilon}$ if $h = 0$ and $0$ otherwise. For readability, we set $\Sigma_{\epsilon} = \sigma^2_{\epsilon} \mathbbm{I}$ for some $\sigma_{\epsilon}^2 > 0$. Our results extend to arbitrary noise covariance matrices via the spectral properties ($\lambda_{\min}, \lambda_{\max}$) of $\Sigma_{\epsilon}$.

 The autocovariance matrix of $\myCurls{x_t}_{t \in \mathbb{Z}}$ plays a central role in our results and analysis. For any $n \in \N$, we use $\Sigma_{n}$ to denote the autocovariance matrix of size $n$, defined as $\E [(y^n_{t} - \E[y^n_{t}]) (y^n_{t} - \E[y^n_{t}])^T]$.

 It is convenient to rewrite the VAR($p$) model in Equation~\eqref{eq:VAR} as a VAR(1) model, $y_t = A y_{t-1} + e_t$, where $y_t, e_t \in \mathbb{R}^{dp}$ are defined as $y_t = \begin{pmatrix} x_{t}, x_{t-1}, \cdots, x_{t-p+1}\end{pmatrix}^T$ and $e_t = \begin{pmatrix} \epsilon_t, 0, \cdots, 0 \end{pmatrix}^T$, and $A \in \mathbb{R}^{dp \times dp}$ is a \textit{(multi) companion matrix} defined as
    \begin{align}
    \label{eq:var_1_def}
    %
        A = \begin{pmatrix}
        A_1 & A_2 & \cdots & A_{p-1} & A_p \\
        I & 0 & \cdots & 0 & 0 \\
        0 & I & \cdots & 0 & 0 \\
        \vdots &\vdots &\cdots &\vdots &\vdots \\
        0 & 0 & \cdots & I & 0
        \end{pmatrix}.
    \end{align}
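As a quick sanity check of this rewriting, consider the illustrative case $p = 2$ (chosen here only as an example). The VAR(1) form then reads
\begin{equation*}
\begin{pmatrix} x_t \\ x_{t-1} \end{pmatrix}
=
\begin{pmatrix} A_1 & A_2 \\ I & 0 \end{pmatrix}
\begin{pmatrix} x_{t-1} \\ x_{t-2} \end{pmatrix}
+
\begin{pmatrix} \epsilon_t \\ 0 \end{pmatrix},
\end{equation*}
whose first block row recovers $x_t = A_1 x_{t-1} + A_2 x_{t-2} + \epsilon_t$, while the second block row is the trivial identity $x_{t-1} = x_{t-1}$.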
The eigenvalues of the multi-companion matrix $A$ fully characterize the stability and stationarity of the VAR process. For a VAR($p$) process to be weakly stationary, that is, for its mean and autocovariance to be invariant over time, the eigenvalues of $A$, which satisfy
\begin{equation}
\det \left( \mathbb{I}_d \lambda^p - A_1 \lambda^{p-1} - A_2 \lambda^{p-2} - \cdots - A_p \right) = 0,
\end{equation}
must not lie on the unit circle. If all eigenvalues satisfy $\abs{\lambda_i} < 1$, then the process is stable, that is, its values do not diverge \parencite{lutkepohl2013vector}.
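To illustrate the eigenvalue condition, consider a scalar example with $d = 1$ and $p = 2$; the coefficients $a_1, a_2$ below are hypothetical values chosen only for illustration. The characteristic equation reduces to $\lambda^2 - a_1 \lambda - a_2 = 0$. For $a_1 = 0.5$ and $a_2 = 0.3$,
\begin{equation*}
\lambda_{1,2} = \frac{0.5 \pm \sqrt{0.25 + 1.2}}{2} \approx 0.852, \; -0.352,
\end{equation*}
so both eigenvalues lie strictly inside the unit circle and the process is stable. For $a_1 = a_2 = 0.5$, in contrast, the equation factors as $(\lambda - 1)(\lambda + 0.5) = 0$, so $\lambda = 1$ lies on the unit circle (a unit root) and the weak stationarity condition is violated.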

\textbf{Causal Models.} Under the assumptions of causal sufficiency and absence of contemporaneous influences, interpreting the VAR model in \eqref{eq:VAR} as a set of structural equations yields the corresponding causal model. We consider the family of all VAR models as our function class $\mathcal{F}$ of statistical and causal estimators.