Causal Structural Hypothesis Testing and Data Generation Models

03 Oct 2022 (modified: 29 Apr 2024), NeurIPS 2022 SyntheticData4ML, Readers: Everyone
Keywords: causality, generative modeling, generalization, hypothesis testing
TL;DR: We propose an architecture for comparing and using various causal structural priors to test hypotheses and generate new datapoints in and out of the training data distribution.
Abstract: A vast amount of expert and domain knowledge is captured by causal structural priors, yet there has been little research on testing such priors for generalization and data synthesis purposes. We propose a novel model architecture, Causal Structural Hypothesis Testing, that can use nonparametric, structural causal knowledge and approximate a causal model’s functional relationships using deep neural networks. We use these architectures to compare structural priors, akin to hypothesis testing, using a deliberate (non-random) split of training and testing data. Extensive simulations demonstrate the effectiveness of out-of-distribution generalization error as a proxy for causal structural prior hypothesis testing and offer a statistical baseline for interpreting results. We show that the variational version of the architecture, Causal Structural Variational Hypothesis Testing, can improve performance in low-SNR regimes. Due to the simplicity and low parameter count of the models, practitioners can test and compare structural prior hypotheses on small datasets and use the priors with the best generalization capacity to synthesize much larger, causally informed datasets. Finally, we validate our methods on a synthetic pendulum dataset and show a use case on a real-world trauma surgery ground-level falls dataset. Our code is available on GitHub.
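The comparison described in the abstract (fit one model per structural hypothesis, then rank hypotheses by error on a deliberately non-random held-out split) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the toy variables `A`, `B`, `Y`, the helper names `make_data`, `deliberate_split`, and `ood_error`, and the use of linear least squares as a stand-in for the deep neural networks the paper uses.

```python
import numpy as np

def make_data(n=400, seed=0):
    """Toy data (hypothetical): A causes Y, and A also causes B. B does not cause Y."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.0, 2.0, n)
    y = 2.0 * a + rng.normal(0.0, 0.1, n)   # true mechanism: Y depends on A only
    b = a**2 + rng.normal(0.0, 0.1, n)      # B is a nonlinear child of A
    return {"A": a, "B": b, "Y": y}

def deliberate_split(data, key="A", threshold=1.0):
    """Non-random split: train where data[key] < threshold, test on the rest."""
    mask = data[key] < threshold
    train = {k: v[mask] for k, v in data.items()}
    test = {k: v[~mask] for k, v in data.items()}
    return train, test

def ood_error(parents, train, test, target="Y"):
    """Fit target ~ parents by least squares on train; return MSE on the test split.

    Linear regression is a stand-in here; the paper approximates the
    functional relationships with neural networks instead.
    """
    def design(d):
        cols = [d[p] for p in parents] + [np.ones(len(d[target]))]
        return np.column_stack(cols)

    coef, *_ = np.linalg.lstsq(design(train), train[target], rcond=None)
    residual = design(test) @ coef - test[target]
    return float(np.mean(residual**2))

data = make_data()
train, test = deliberate_split(data)

# Each hypothesis lists the assumed parents of Y in the causal graph.
hypotheses = {"A -> Y": ["A"], "B -> Y": ["B"]}
errors = {name: ood_error(parents, train, test) for name, parents in hypotheses.items()}
best = min(errors, key=errors.get)
print(errors, "best:", best)
```

In this toy setup both hypotheses fit the training range reasonably well, but only the model built on the true parent extrapolates to the held-out range of `A`, so the out-of-distribution error separates the two hypotheses.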
Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/arxiv:2210.11275/code)