Keywords: causality, graphs for science, graph theory, causal discovery, structure learning
Abstract: In scientific practice, variables are rarely measured at random: they are chosen because experts expect them to be causally relevant and part of the same underlying causal system. This implies that realistic causal graphs should be \emph{sparse}, reflecting simple mechanisms, yet also \emph{connected}, since no variable is truly isolated. Existing continuous optimisation methods for learning directed acyclic graphs (DAGs) enforce sparsity and acyclicity but often produce fragmented structures, contradicting this basic property of scientific data. We address this gap by introducing a spectral regulariser based on \emph{algebraic connectivity}, the Fiedler eigenvalue of the graph Laplacian. The penalty is differentiable, inexpensive to compute, and model-agnostic; it can be added to any learner that outputs a weighted adjacency matrix. We demonstrate its effectiveness in two representative frameworks---GOLEM (likelihood-based, linear Gaussian) and a graph autoencoder (nonlinear encoder–decoder)---without altering their optimisation routines. Across synthetic benchmarks of sparse, weakly connected DAGs and Erdős–Rényi DAGs with up to 200 nodes, the regulariser consistently improves global graph structure, yielding larger components and fewer isolated nodes, while preserving or improving edge-level recovery (higher F1, lower SHD and SID). These results establish algebraic connectivity as a principled and practical tool for causal discovery, aligning learned graphs with the way scientific data are collected and offering a simple drop-in enhancement to existing methods.
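For intuition only, the following minimal sketch shows how a connectivity penalty of this kind can be computed from a weighted adjacency matrix, assuming PyTorch. The symmetrisation, the margin `eps`, and the hinge form of the penalty are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch (not the paper's code): a differentiable penalty based on
# the Fiedler eigenvalue (algebraic connectivity) of the graph Laplacian,
# computed from a learner's weighted adjacency matrix W.
import torch

def connectivity_penalty(W: torch.Tensor, eps: float = 1e-2) -> torch.Tensor:
    """Penalise weak connectivity of an n x n weighted adjacency W (assumed form)."""
    A = W.abs() + W.abs().T                # symmetrise: treat edges as undirected
    deg = A.sum(dim=1)                     # weighted node degrees
    L = torch.diag(deg) - A                # graph Laplacian
    eigvals = torch.linalg.eigvalsh(L)     # eigenvalues in ascending order, differentiable
    lambda2 = eigvals[1]                   # Fiedler eigenvalue (algebraic connectivity)
    return torch.relu(eps - lambda2)       # zero once the graph is sufficiently connected

# Usage (schematic): add to an existing structure learner's objective, e.g.
# loss = fit_loss + acyclicity_term + sparsity_term + rho * connectivity_penalty(W)
```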
Submission Number: 54