Bayesian AutoML for Databases via the InferenceQL Probabilistic Programming System

Published: 16 May 2022, Last Modified: 05 May 2023
AutoML 2022 (Late-Breaking Workshop)
Readers: Everyone
Abstract: InferenceQL is a probabilistic programming system for scalable Bayesian AutoML from database tables. InferenceQL is designed to help make Bayesian approaches to data analysis more accessible to broad audiences and to assist experts in auditing and improving the quality of data, models, and inferences. Unlike traditional probabilistic programming systems, InferenceQL provides automation for learning models using nonparametric Bayesian structure learning of probabilistic programs. Experts can override these models with custom probabilistic programs for specific subsets of variables and conditional distributions. For a broad class of models, InferenceQL can generate realistic synthetic data subject to constraints and can automatically compute exact probabilities and mutual information values. Finally, InferenceQL aims to enable scalable Bayesian model criticism via posterior predictive checks, data quality screening via conditional probability calculation, fairness auditing via conditional probability ratios, and synthetic data generation to enhance privacy. These capabilities are accomplished using constructs that interleave standard database queries with Bayesian inference.
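The quantities named in the abstract (conditional probabilities, conditional probability ratios for fairness auditing, and mutual information) can be illustrated on a toy discrete distribution. The sketch below is plain Python, not InferenceQL syntax; the joint table, the variable names `group` and `approved`, and every helper function are hypothetical and chosen only for illustration. It assumes the joint distribution is small enough to enumerate, which is what makes the results exact.

```python
import math

# Hypothetical joint distribution P(group, approved) over two discrete
# variables. The numbers are made up for illustration only.
JOINT = {
    ("a", 1): 0.30,
    ("a", 0): 0.20,
    ("b", 1): 0.15,
    ("b", 0): 0.35,
}


def marginal(joint, index):
    """Marginalize the joint onto the variable at position `index`."""
    out = {}
    for key, p in joint.items():
        out[key[index]] = out.get(key[index], 0.0) + p
    return out


def conditional(joint, outcome, group):
    """Exact P(approved = outcome | group = group)."""
    p_group = marginal(joint, 0)[group]
    return joint[(group, outcome)] / p_group


def probability_ratio(joint, outcome, group_x, group_y):
    """Ratio P(outcome | group_x) / P(outcome | group_y)."""
    return conditional(joint, outcome, group_x) / conditional(joint, outcome, group_y)


def mutual_information(joint):
    """Exact mutual information between the two variables, in nats."""
    p_group = marginal(joint, 0)
    p_outcome = marginal(joint, 1)
    return sum(
        p * math.log(p / (p_group[g] * p_outcome[o]))
        for (g, o), p in joint.items()
        if p > 0
    )


if __name__ == "__main__":
    print("P(approved=1 | group=a):", conditional(JOINT, 1, "a"))
    print("ratio a vs b:", probability_ratio(JOINT, 1, "a", "b"))
    print("mutual information (nats):", mutual_information(JOINT))
```

A ratio far from 1 flags a dependence between outcome and group that may merit auditing, and a mutual information of zero would indicate that the two variables are independent under the model. In a system like the one described, these probabilities would come from a learned probabilistic model rather than a hand-written table.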
Keywords: trustworthy AutoML, probabilistic programming, SQL, Bayesian inference
One-sentence Summary: A probabilistic programming platform with SQL-like queries and automated Bayesian data modeling
Reproducibility Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
Reviewers: Ulrich Schaechtle, u.schaechtle@gmail.com
Main Paper And Supplementary Material: pdf