Causally Fair Node Classification on Non-IID Graph Data

TMLR Paper7064 Authors

19 Jan 2026 (modified: 17 Feb 2026), Under review for TMLR, CC BY 4.0
Abstract: Fair machine learning seeks to identify and mitigate biases in predictions against disadvantaged populations characterized by demographic attributes, such as race and gender. Recent research has extended fairness to graph data, such as social networks, but many approaches neglect the causal relationships among data instances. This paper addresses, from a causal perspective, a prevalent limitation of fair ML algorithms, which typically assume Independent and Identically Distributed (IID) data. We base our research on the Network Structural Causal Model (NSCM) framework and develop a Message Passing Variational Autoencoder for Causal Inference (MPVA) framework that estimates interventional distributions and uses these estimates for causally fair node classification. The theoretical soundness of the proposed method is established under two general and practical conditions: Decomposability and Graph Independence. These conditions formalize when interventional distributions can be computed using do-calculus in non-IID settings, thereby grounding the framework in rigorous causal inference theory rather than imposing ad hoc constraints. Empirical evaluations on semi-synthetic and real-world datasets demonstrate that MPVA outperforms conventional methods by effectively approximating interventional distributions and mitigating bias. Our findings underscore the potential of causality-based fairness in complex ML applications and set the stage for further research into relaxing the initial assumptions to enhance model fairness.
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Mingming_Gong1
Submission Number: 7064