Improved Density Ratio Estimation for Evaluating Synthetic Data Quality

Published: 04 Mar 2025, Last Modified: 17 Apr 2025
Venue: ICLR 2025 Workshop SynthData
Readers: Everyone
License: CC BY 4.0
Keywords: Synthetic Data Evaluation, Density Ratio Estimation, Domain Adaptation, Kernel Methods
TL;DR: An aggregation algorithm to resolve parameter choice issues in density ratio estimation
Abstract: High-quality synthetic data is essential for accurate downstream analysis. Density Ratio Estimation (DRE) has emerged as a powerful tool for evaluating synthetic data quality. However, existing DRE methods are highly sensitive to hyperparameter selection: suboptimal choices lead to poor convergence rates and degraded empirical performance. To mitigate this, we propose a novel model aggregation algorithm for DRE that trains multiple models with diverse hyperparameter configurations and combines their outputs. Our approach achieves fast convergence without requiring prior knowledge of the smoothness of the unknown density ratio and is minimax optimal for the squared loss. We demonstrate that our method enhances the performance of established DRE techniques across benchmark datasets, achieving state-of-the-art results on MiniDomainNet and Amazon Reviews.
Submission Number: 40
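
Illustrative sketch (not the authors' implementation): the abstract describes training several DRE models under different hyperparameter configurations and combining their outputs, but it does not specify the base estimator or the combination rule. The Python example below is one possible instance under stated assumptions: the base models are kernel least-squares estimators in the style of uLSIF, the aggregation is a ridge-stabilised linear combination fitted on held-out data under the same squared loss, and the data are toy 1-D Gaussians. All names (fit_ulsif, aggregate, squared_loss, the grid values) are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, centers, sigma):
    """Gaussian kernel matrix between rows of x and rows of centers."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def fit_ulsif(x_num, x_den, centers, sigma, lam):
    """Kernel least-squares estimate of r(x) = p_num(x) / p_den(x).

    Assumed base learner (uLSIF-style); the abstract does not name one.
    """
    phi_num = gaussian_kernel(x_num, centers, sigma)
    phi_den = gaussian_kernel(x_den, centers, sigma)
    H = phi_den.T @ phi_den / len(x_den)   # second moment under the denominator
    h = phi_num.mean(axis=0)               # first moment under the numerator
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: gaussian_kernel(x, centers, sigma) @ alpha


def squared_loss(r_num, r_den):
    """Empirical squared loss for DRE, up to a constant independent of r."""
    return 0.5 * np.mean(r_den ** 2) - np.mean(r_num)


def aggregate(models, x_num_val, x_den_val, ridge=1e-3):
    """Linear aggregation of base models on held-out data (assumed rule)."""
    R_num = np.column_stack([m(x_num_val) for m in models])
    R_den = np.column_stack([m(x_den_val) for m in models])
    G = R_den.T @ R_den / len(x_den_val)
    g = R_num.mean(axis=0)
    w = np.linalg.solve(G + ridge * np.eye(len(models)), g)
    combined = lambda x: np.column_stack([m(x) for m in models]) @ w
    return combined, w


# Toy data: "real" samples as numerator, "synthetic" samples as denominator.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(400, 1))
synth = rng.normal(0.5, 1.2, size=(400, 1))
real_tr, real_val = real[:200], real[200:]
synth_tr, synth_val = synth[:200], synth[200:]
centers = real_tr[:50]

# Diverse hyperparameter configurations (kernel width, regularisation).
grid = [(sigma, lam) for sigma in (0.1, 0.5, 1.0, 3.0) for lam in (1e-3, 1e-1)]
models = [fit_ulsif(real_tr, synth_tr, centers, s, l) for s, l in grid]

base_losses = [squared_loss(m(real_val), m(synth_val)) for m in models]
aggregated, weights = aggregate(models, real_val, synth_val)
agg_loss = squared_loss(aggregated(real_val), aggregated(synth_val))

print("best single-model held-out loss:", min(base_losses))
print("aggregated held-out loss:", agg_loss)
```

In this sketch the aggregated loss is computed on the same held-out split used to fit the weights, so it is only a sanity check; a fair comparison would score both the base models and the aggregate on a third, untouched split.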