GANBLR++: Incorporating Capacity to Generate Numeric Attributes and Leveraging Unrestricted Bayesian Networks
Abstract: Generative Adversarial Network (GAN) models have led to major breakthroughs in generating many kinds of data, and over the years several GAN-based approaches have been applied to tabular data generation as well. Very recently, GANBLR, a GAN-style framework that uses Bayesian Networks (BN) as both generator and discriminator, has been shown to achieve state-of-the-art (SOTA) results for tabular data generation. Despite its impressive performance, GANBLR has an inherent weakness: it can only generate data with categorical attributes. Additionally, the model has been trained and tested only with a restricted Bayesian Network. In this work, we propose GANBLR++, an extension of the GANBLR framework that can generate numeric attributes by leveraging a Dirichlet Mixture Model. We also incorporate an unrestricted BN into the GANBLR framework and discuss how it can lead to better-quality data as well as a more interpretable model. We evaluate GANBLR++ on a wide range of datasets and demonstrate that it generates better-quality data than existing SOTA models for tabular (numeric and categorical) data generation, such as CTGAN, MedGAN and TableGAN.
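To make the numeric-attribute idea concrete, the following is a minimal sketch of one way a mixture model can bridge numeric values and a categorical-only generator: each numeric value is encoded as the index of its most likely mixture component, and a generated index is decoded by sampling from that component. The abstract does not spell out the procedure, so everything here is an assumption for illustration: the component parameters in `COMPONENTS` are made up rather than fitted by a Dirichlet Mixture Model, the components are taken to be Gaussian, and the helper names `encode` and `decode` are hypothetical.

```python
import math
import random

# Hypothetical mixture for one numeric attribute (values are invented;
# in practice they would come from a fitted mixture model).
COMPONENTS = [
    {"weight": 0.5, "mean": 25.0, "std": 3.0},
    {"weight": 0.3, "mean": 45.0, "std": 5.0},
    {"weight": 0.2, "mean": 70.0, "std": 4.0},
]


def log_density(x, comp):
    """Log of weight * Normal(x | mean, std) for one component."""
    z = (x - comp["mean"]) / comp["std"]
    return (
        math.log(comp["weight"])
        - math.log(comp["std"])
        - 0.5 * math.log(2.0 * math.pi)
        - 0.5 * z * z
    )


def encode(x, components=COMPONENTS):
    """Numeric value -> categorical label (index of the most likely component)."""
    return max(range(len(components)), key=lambda k: log_density(x, components[k]))


def decode(k, rng, components=COMPONENTS):
    """Categorical label -> numeric value, sampled from component k."""
    comp = components[k]
    return rng.gauss(comp["mean"], comp["std"])


rng = random.Random(0)  # seeded so repeated runs give the same samples
codes = [encode(x) for x in (24.0, 47.5, 71.0)]
samples = [decode(k, rng) for k in codes]
print(codes, samples)
```

Under this assumed scheme, a generator restricted to categorical attributes only ever sees the component labels, and numeric values are recovered in a separate sampling step after generation.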