Submission Track: Findings & Open Challenges (Tiny Paper)
Submission Category: AI-Guided Design + Automated Material Characterization
Keywords: Synthetic Data, Data Flywheel, Conditional Generative Model, Graph Neural Network, Material Property Prediction
Abstract: Data scarcity and the high cost of annotation have long been challenges in materials science. Inspired by the potential of synthetic data in other fields such as computer vision, we propose the MatWheel framework, which iteratively trains a material property prediction model on synthetic data generated by a conditional generative model. We explore two scenarios: fully-supervised and semi-supervised learning. Using CGCNN for property prediction and Con-CDVAE as the conditional generative model, we conduct experiments on six data-scarce material property datasets from the Matminer database. Results show that synthetic data has potential in extremely data-scarce scenarios, achieving performance close to or exceeding that obtained with real samples in all six tasks. We also find that pseudo-labels have little impact on the quality of the generated data. Future work will integrate more advanced models and optimize generation conditions to boost the effectiveness of the materials data flywheel.
Submission Number: 19