$\psi$DAG: Projected Stochastic Approximation Iteration for Linear DAG Structure Learning

11 Apr 2025 (modified: 29 Oct 2025) · Submitted to NeurIPS 2025 · CC BY 4.0
Keywords: Structure learning, Continuous Optimization, Directed Acyclic Graphs
TL;DR: We propose a scalable framework for learning DAGs using Stochastic Approximation with SGD and efficient projection methods, offering improved performance and computational efficiency for large-scale problems.
Abstract: Learning the structure of Directed Acyclic Graphs (DAGs) presents a significant challenge due to the vast combinatorial search space of possible graphs, which grows super-exponentially with the number of nodes. Recent advancements have recast this problem as a continuous optimization task by incorporating differentiable acyclicity constraints. These methods commonly rely on algebraic characterizations of DAGs, such as matrix exponentials, to enable the use of gradient-based optimization techniques. Despite these innovations, existing methods often struggle to optimize the highly non-convex DAG constraints and incur high per-iteration computational costs. In this work, we present a novel framework for learning DAGs, employing a Stochastic Approximation approach integrated with Stochastic Gradient Descent (SGD)-based optimization techniques. Our framework introduces new projection methods tailored to efficiently enforce DAG constraints, ensuring that the algorithm converges to a feasible local minimum. With its low iteration complexity, the proposed method is well-suited for handling large-scale problems with improved computational efficiency. We demonstrate the effectiveness and scalability of our framework through comprehensive experimental evaluations, which confirm its superior performance across various settings.
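To make the projected stochastic approximation idea concrete, here is a minimal sketch for a linear SEM X ≈ XW: a minibatch SGD step on the least-squares loss, followed by a projection of the weight matrix back onto the set of DAGs. The acyclicity measure is the standard NOTEARS-style characterization h(W) = tr(exp(W ∘ W)) − d mentioned in the abstract. The greedy edge-removal projection, the function names (`psi_dag_sgd`, `project_to_dag`), and all parameter choices are illustrative assumptions; the paper's actual, more efficient projection operator is not specified on this page.

```python
import numpy as np
from scipy.linalg import expm

def acyclicity(W):
    # NOTEARS-style algebraic characterization referenced in the abstract:
    # h(W) = tr(exp(W ∘ W)) - d, which equals zero iff W is acyclic.
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

def project_to_dag(W, tol=1e-8):
    # Illustrative (naive) projection: zero out the smallest-magnitude
    # remaining edge until the acyclicity measure vanishes. The paper
    # proposes more efficient projection methods than this sketch.
    W = W.copy()
    while acyclicity(W) > tol:
        rows, cols = np.nonzero(W)
        k = np.argmin(np.abs(W[rows, cols]))
        W[rows[k], cols[k]] = 0.0
    return W

def psi_dag_sgd(X, n_iters=500, lr=1e-2, batch=64, seed=0):
    # Projected stochastic approximation for a linear SEM X ≈ X W:
    # an SGD step on the least-squares loss, then a DAG projection.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros((d, d))
    for _ in range(n_iters):
        idx = rng.choice(n, size=min(batch, n), replace=False)
        Xb = X[idx]
        grad = Xb.T @ (Xb @ W - Xb) / len(idx)  # grad of (1/2m)||Xb - Xb W||_F^2
        W -= lr * grad
        np.fill_diagonal(W, 0.0)                # forbid self-loops
        W = project_to_dag(W)
    return W
```

By construction, `psi_dag_sgd(X)` returns a weighted adjacency matrix whose support is acyclic after every iterate, since the projection step is applied before the next stochastic gradient is drawn.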
Primary Area: Optimization (e.g., convex and non-convex, stochastic, robust)
Submission Number: 1827