
Many techniques have been developed to infer causal relations among random variables from observational data~\citep{Spirtes2000,Chickering2003,Shimizu2006,Peters2014,Zheng2018}. While most of these procedures focus on structure learning among scalar variables, there has been steadily increasing interest in settings where groups of measurements constitute the causal entities of interest~\citep[see][for an overview]{Wahl2024}. For instance, in neuroscience, causal connections between brain regions rather than individual neurons are often of interest~\citep{Panzeri2017, Kohn2020, Semedo2020}. In earth and climate science, researchers are frequently interested in causal interactions among groups of measurements (e.g., wind speed, air pressure, etc.) across a number of grid locations spanning the planet \citep[and references therein]{Runge2015}. In psychology and the social sciences, studies often measure psychological or societal traits through several proxy variables that need to be treated jointly \citep{Cronbach1955, Campbell1959, Antonoplis2022}. In industrial manufacturing, quality control systems often record several related measurements from automated processes, e.g., industrial welding, injection molding, or staking and pressing, where the causal relationships among these processes, rather than among individual measurements, are of interest \citep{Vukovic2022,Kikuchi2023,Goebler2024}.

\begin{figure}
  \centering
  \scalebox{0.6}{
    \input{./Figures/overview.tex}
  }
  \caption{An example of a grouped additive noise model with three groups and varying group sizes.}
  \label{fig:ganm_example}
\end{figure}

There are a number of ways to incorporate known group structures in causal learning tasks. The most elementary approach is to apply dimension reduction, e.g., by taking the mean across all members of a group. Given the resulting set of scalar summary variables, standard causal discovery methods may be employed. This comes at the cost of severe information loss, and may even render conditional independence results useless \citep{Wahl2024}. A second approach disregards the grouping structure during learning and applies causal discovery techniques to the individual group members. In a subsequent step, the resulting graph estimate is coarsened to represent the variable groupings~\citep[see][]{Rubenstein2017,Chalupka2016, Parviainen2017, Anand2023}. A third approach treats the groups themselves as the causal quantities of interest and aims at performing causal learning on the groups directly \citep{Janzing2010, Zscheischler2011, Entner2012, Wahl2023}. Our work belongs to this last framework of group causal learning.

Motivated by the work of \citet{Peters2014} on nonlinear causal discovery in continuous additive noise models (ANMs) for multivariate data, we revisit the identifiability of causal directions in the group setting. In particular, we show that causal directions are, in general, identifiable in the group setting. The statistical problems involved, however, become much more challenging when operating with random vectors rather than scalar random variables. We construct an order-independent version of the regression with subsequent independence test (RESIT) algorithm~\citep{Peters2014} for the group setting and propose efficient and flexible solutions to the estimation problems involved.

The model selection task that follows the estimation of a causal order requires special attention. We propose a novel class of multi-response group sparse additive models (MURGS) for selecting relevant edges. MURGS is a stand-alone feature selection procedure specifically designed for the grouped setting and can be seamlessly integrated into the pruning phase of any order-based method \citep{Teyssier2012}. For example, \citet{Entner2012} estimate causal orderings using grouped linear non-Gaussian ANMs, and we anticipate that other order-based approaches, such as the score matching method introduced by \citet{Rolland2022}, can be extended to the group case.

In summary, our main contributions are the following:
\begin{itemize}
  \item We propose a flexible and fully nonparametric learning strategy that first obtains a causal order by means of neural networks and nonparametric independence tests. Then, for DAG pruning, we introduce MURGS, a class of multi-response group sparse additive models that encourages sparsity at the group level, and analytically derive a closed-form backfitting update for the corresponding block coordinate descent algorithm.
  \item We evaluate our proposed method on synthetic data and demonstrate superior performance compared with several other causal discovery algorithms. Further, we consider real-world manufacturing data with a partially known causal ordering, which allows us to partly assess algorithmic performance.
\end{itemize}

The remainder of the paper is structured as follows. Section~\ref{sec:methodology} introduces the group ANM and establishes identifiability results in the group setting. In Section~\ref{sec:groupresit}, we describe the two phases of the GroupRESIT algorithm, including the development of MURGS for model selection. Section~\ref{sec:experiments} presents the results of our synthetic experiments, and in Section~\ref{sec:real_data} we apply our methods to real data from industrial manufacturing. We conclude in Section~\ref{sec:conclusions}.
