Keywords: GAN, Generative Adversarial Network, Synthetic Data, Utility, Feedback
TL;DR: We propose an architecture called DownStream Feedback Generative Adversarial Network (DSF-GAN), which uses feedback from a downstream prediction model mid-training to add valuable information to the generator's loss function.
Abstract: Utility and privacy are two crucial measurements of synthetic tabular data. While privacy measures have been dramatically improved with the use of Generative Adversarial Networks (GANs), generating high-utility synthetic samples remains challenging. To increase the samples' utility, we propose a novel architecture called DownStream Feedback Generative Adversarial Network (DSF-GAN). This approach uses feedback from a downstream prediction model mid-training to add valuable information to the generator's loss function. Hence, DSF-GAN harnesses a downstream prediction task to increase the utility of the synthetic samples. To properly evaluate our method, we tested it on two popular datasets. Our experiments show improved model performance when training on synthetic samples generated by DSF-GAN, compared with synthetic data generated by the same GAN architecture without feedback, evaluated on the same validation set of real samples. All code and datasets used in this research are openly available for ease of reproduction.
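To make the feedback mechanism concrete, below is a minimal sketch of how a generator update might combine the usual adversarial loss with a loss signal from a downstream prediction model. This is not the authors' code: the network sizes, the choice of downstream task (predicting the last column from the others), and the `feedback_weight` trade-off term are all illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation) of a DSF-GAN-style generator
# step: the generator's loss is the adversarial loss plus feedback from a
# downstream prediction model evaluated on the synthetic samples.
# Dimensions, module names, and `feedback_weight` are illustrative assumptions.
import torch
import torch.nn as nn

latent_dim, n_features = 16, 8  # assumed toy dimensions
generator = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, n_features))
discriminator = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 1))
downstream = nn.Sequential(nn.Linear(n_features - 1, 16), nn.ReLU(), nn.Linear(16, 1))

adv_loss = nn.BCEWithLogitsLoss()
task_loss = nn.BCEWithLogitsLoss()  # assuming a binary downstream target
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
feedback_weight = 1.0  # hypothetical weighting of the downstream feedback term


def generator_step(batch_size: int) -> torch.Tensor:
    """One generator update mixing adversarial loss with downstream feedback."""
    z = torch.randn(batch_size, latent_dim)
    fake = generator(z)

    # Standard GAN generator objective: fool the discriminator.
    g_adv = adv_loss(discriminator(fake), torch.ones(batch_size, 1))

    # Downstream feedback: treat the last synthetic column as the target and
    # score how well the downstream model predicts it from the other columns.
    features, target = fake[:, :-1], torch.sigmoid(fake[:, -1:])
    g_task = task_loss(downstream(features), target)

    loss = g_adv + feedback_weight * g_task
    g_opt.zero_grad()
    loss.backward()
    g_opt.step()
    return loss


if __name__ == "__main__":
    print(float(generator_step(batch_size=64)))
```

In this sketch the discriminator and downstream model are assumed to be trained in their own steps (omitted here); only the generator update, where the feedback enters the loss, is shown.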
Supplementary Material: zip
Submission Number: 106