Abstract: Generative modeling in machine learning aims to synthesize new data samples that are statistically similar to those observed during training. While conventional generative models such as GANs and diffusion models typically assume access to large and diverse datasets, many real-world applications (e.g., in medicine, satellite imaging, and artistic domains) operate under limited data availability and strict constraints. In this survey, we examine Generative Modeling under Data Constraint (GM-DC), which includes limited-data, few-shot, and zero-shot settings. We present a unified perspective on the key challenges in GM-DC, including overfitting, frequency bias, and incompatible knowledge transfer, and discuss how these issues impact model performance.
To systematically analyze this growing field, we introduce two novel taxonomies: one categorizing GM-DC tasks (e.g., unconditional vs. conditional generation, cross-domain adaptation, and subject-driven modeling), and another organizing methodological approaches (e.g., transfer learning, data augmentation, meta-learning, and frequency-aware modeling).
Our study reviews over 230 papers, offering a comprehensive view across generative model types and constraint scenarios. We further analyze task-approach-method interactions using a Sankey diagram and highlight promising directions for future work, including adaptation of foundation models, holistic evaluation frameworks, and data-centric strategies for sample selection.
This survey provides a timely and practical roadmap for researchers and practitioners aiming to advance generative modeling under limited data. Project website: https://anonymous4mysubmission.github.io/gmdc-survey/.
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=47zW24uukd&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DTMLR%2FAuthors%23your-submissions)
Changes Since Last Submission:
- We have added a new section (Sec. 6) that includes 1) a critical analysis and design principles of approaches (Sec. 6.1), and 2) an empirical comparison, both qualitative and quantitative, across tasks (Sec. 6.2), as suggested by the reviewer.
- We have updated Sec. 3.2 to clarify the definition of data constraint ranges, as suggested by the reviewer.
- We have added Sec. 1.1 to discuss the paper selection and search strategy of our survey, following the reviewer’s recommendation.
- We have updated Sec. 2 *Related Work* to discuss general generative modeling, as suggested by the reviewer.
- We have added Sec. 7.2 to discuss the progress trends across approach categories, following the reviewer’s recommendations.
Assigned Action Editor: ~Gabriel_Loaiza-Ganem1
Submission Number: 5456