Active Learning for Flow Matching Model in Shape Design: A Perspective from Continuous Condition Dataset


ICLR 2026 Conference Submission17562 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0

Keywords: Flow Matching Model, Active Learning, Shape Design, Generative Model

TL;DR: This work establishes a foundation for active learning in flow matching models by first theoretically characterizing the impact of data on diversity and accuracy, and then translating this understanding into targeted query strategies.

Abstract: Although flow matching models have demonstrated powerful capabilities in modern machine learning, their training relies on a very large number of high-quality labeled samples. In certain fields, notably medical imaging and numerical simulation, acquiring such datasets is hindered by exorbitant labeling costs, so selecting the most informative samples for training at minimal cost is a key challenge. This is the central problem of active learning, a subfield of machine learning dedicated to maximizing model performance while minimizing annotation cost, where the core task is to design a query strategy that acquires the most informative samples with minimal labeling effort. This paper presents a pilot study that applies active learning, which has traditionally been explored in the context of discriminative models, to flow matching models. By analyzing flow matching models through a piecewise-linear neural network framework, this work elucidates how individual data points influence the diversity and accuracy of the model. Leveraging this analytical framework, we propose two distinct query strategies: one aimed at enhancing model diversity, and the other designed to improve model accuracy. We demonstrate that these two strategies are inherently conflicting, which partially explains, from a dataset perspective, the fundamental trade-off between diversity and accuracy in flow matching models. Furthermore, we introduce a mixed strategy that combines the two through a weighted mechanism, so that the diversity-accuracy trade-off can be adjusted by tuning the corresponding weights. Extensive experiments validate the effectiveness of our approach, showing that the proposed query strategies outperform those designed for discriminative models.
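The weighted mixed strategy mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the diversity and accuracy scores are computed, nor how they are combined, so everything below is an assumption made for illustration: the function names `min_max` and `mixed_query`, the min-max rescaling, the convex combination with a single `weight`, and the toy scores are all hypothetical and are not the authors' method.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def mixed_query(diversity_scores, accuracy_scores, weight, k):
    """Return indices of the k candidates with the highest mixed score.

    Hypothetical combination rule (not taken from the paper):
    weight = 1.0 ranks purely by the diversity score,
    weight = 0.0 ranks purely by the accuracy score.
    """
    if len(diversity_scores) != len(accuracy_scores):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    d = min_max(diversity_scores)
    a = min_max(accuracy_scores)
    # Convex combination of the two rescaled scores for each candidate.
    mixed = [weight * di + (1.0 - weight) * ai for di, ai in zip(d, a)]
    # Sort candidate indices by mixed score, highest first.
    ranked = sorted(range(len(mixed)), key=lambda i: mixed[i], reverse=True)
    return ranked[:k]


# Toy scores for five unlabeled candidates (made-up numbers).
diversity = [0.9, 0.1, 0.5, 0.3, 0.7]
accuracy = [0.1, 0.9, 0.5, 0.8, 0.2]

print(mixed_query(diversity, accuracy, weight=1.0, k=2))
print(mixed_query(diversity, accuracy, weight=0.0, k=2))
```

In this toy setup the two extreme weights select disjoint candidates, which mirrors the conflict between the two strategies that the abstract describes; intermediate weights interpolate between the two rankings.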

Primary Area: generative models

Submission Number: 17562
