Keywords: Graph Representation Learning, Causal Learning, Graph Neural Networks
Abstract: Modeling causal relationships in graph representation learning remains a fundamental challenge. Existing approaches often draw on theories and methods from causal inference to identify causal subgraphs or mitigate confounders. However, due to the inherent complexity of graph-structured data, these approaches frequently aggregate diverse graph elements into single causal variables—an operation that risks violating the core assumptions of causal inference. In this work, we provide a theoretical proof demonstrating that such aggregation compromises causal validity. Building on this result, we propose a theoretical framework grounded in the smallest indivisible units of graph data, ensuring theoretical soundness. With this framework, we further analyze the costs of achieving precise causal modeling in graph representation learning and identify the conditions under which the problem can be simplified. To empirically support our theory, we construct a controllable synthetic dataset that reflects real-world causal structures and conduct extensive experiments for validation. Finally, we develop a causal modeling enhancement module that can be seamlessly integrated into existing graph learning pipelines, and we demonstrate its effectiveness through comprehensive comparative experiments.
Supplementary Material: zip
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 3212