The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications
Keywords: Causal discovery, evaluation metrics, real-world datasets
Abstract: Causal discovery aims to automatically uncover causal relationships from data, a capability with significant potential across many scientific disciplines. However, its real-world applicability remains limited, mainly due to poor data practices, such as overreliance on unrealistic datasets and the use of inadequate evaluation metrics. This paper systematically reviews the recent causal discovery literature, highlighting the disconnect between current benchmarking practices and practical applications. We present applications from biology, neuroscience, and Earth sciences, fields where causal discovery holds promise for addressing key challenges. We catalog available simulated and real-world datasets from these domains and discuss common assumption violations that have spurred the development of new methods. Finally, we recommend that the causal discovery community adopt more appropriate evaluation metrics and use a more diverse range of realistic datasets.
Submission Number: 11