Abstract: Causal discovery (CD) algorithms are increasingly applied to socially and ethically sensitive domains. However, their evaluation under realistic conditions remains challenging due to the scarcity of real-world datasets annotated with ground-truth causal structures. Whereas synthetic data generators support controlled benchmarking, they often overlook forms of bias, such as dependencies involving sensitive attributes, which may significantly affect the observed distribution and compromise the trustworthiness of downstream analysis. This paper introduces a novel synthetic data generation framework that enables controlled bias injection while preserving the causal relationships specified in a ground-truth causal graph. The framework aims to evaluate the reliability of CD methods by examining the impact of varying bias levels and outcome binarization thresholds. Experimental results show that even moderate bias levels can lead CD approaches to fail to correctly infer causal links, particularly those connecting sensitive attributes to decision outcomes. These findings underscore the need for expert validation and highlight the limitations of current CD methods in fairness-critical applications. Our proposal thus provides an essential tool for benchmarking and improving CD algorithms in biased, real-world data settings.
External IDs:doi:10.1109/access.2025.3573201
Loading