Iterative Subset Selection for High-fidelity Synthetic Tabular Data

Published: 18 Nov 2025, Last Modified: 18 Nov 2025
Venue: AITD@EurIPS 2025 Poster
Readers: Everyone
License: CC BY 4.0
Submission Type: Short paper (4 pages)
Keywords: tabular data, synthetic data generation
TL;DR: We present ISSOSYNTH, an iterative subset selection method that improves the fidelity of synthetic tabular data.
Abstract: We present ISSOSYNTH, an iterative subset selection method for generating high-fidelity synthetic tabular data. The approach won the Mostly AI prize (2025) for generating the synthetic data with the highest fidelity. We evaluate the fidelity, utility, and empirical privacy of the approach on various datasets and show that it improves fidelity and utility on downstream tasks without notably increasing vulnerability to membership inference attacks.
Relevance Comments: synthetic data generation for tabular data
Published Venue And Year: AI for Tabular Data workshop at EurIPS 2025
Submission Number: 44
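
Illustrative sketch: The abstract names iterative subset selection but does not describe the procedure, so the following is only one possible reading and should not be taken as the ISSOSYNTH algorithm. It assumes a pool of candidate synthetic rows is already available and uses a simple greedy swap search that keeps a swap whenever it lowers the L1 distance between the per-column category frequencies of the subset and of the reference data. The function names (marginal_l1, select_subset), the fidelity measure, and the toy data are all hypothetical.

```python
import numpy as np


def marginal_l1(reference, subset, n_levels):
    """Sum over columns of the L1 distance between category frequencies."""
    total = 0.0
    for col, k in enumerate(n_levels):
        ref_freq = np.bincount(reference[:, col], minlength=k) / len(reference)
        sub_freq = np.bincount(subset[:, col], minlength=k) / len(subset)
        total += np.abs(ref_freq - sub_freq).sum()
    return total


def select_subset(reference, pool, size, n_levels, n_iters=500, seed=0):
    """Greedy swap search (an assumed stand-in, not the paper's method).

    Starts from a random subset of the pool and accepts a swap with an
    unused pool row only if it strictly lowers the marginal distance.
    Requires len(pool) > size.
    """
    rng = np.random.default_rng(seed)
    chosen = rng.choice(len(pool), size=size, replace=False)
    unused = np.setdiff1d(np.arange(len(pool)), chosen)
    history = [marginal_l1(reference, pool[chosen], n_levels)]
    for _ in range(n_iters):
        i = rng.integers(size)          # position in the subset to replace
        j = rng.integers(len(unused))   # candidate replacement from the pool
        trial = chosen.copy()
        trial[i] = unused[j]
        score = marginal_l1(reference, pool[trial], n_levels)
        if score < history[-1]:
            unused[j] = chosen[i]       # return the evicted row to the pool
            chosen = trial
            history.append(score)
    return chosen, history


# Toy data: two categorical columns encoded as non-negative integers.
rng = np.random.default_rng(1)
n_levels = [3, 4]
reference = np.column_stack([
    rng.choice(3, 1000, p=[0.7, 0.2, 0.1]),
    rng.choice(4, 1000, p=[0.1, 0.1, 0.2, 0.6]),
])
# A deliberately mismatched candidate pool (uniform categories).
pool = np.column_stack([
    rng.integers(0, 3, 5000),
    rng.integers(0, 4, 5000),
])

chosen, history = select_subset(reference, pool, size=500, n_levels=n_levels)
print(f"marginal L1 distance: {history[0]:.3f} -> {history[-1]:.3f}")
```

Matching one-dimensional marginals is a deliberately narrow notion of fidelity chosen to keep the sketch short; it says nothing about utility or about the membership inference evaluation mentioned in the abstract.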