Data Distillation for Neural Network Potentials toward Foundational Dataset

Published: 27 Oct 2023, Last Modified: 07 Nov 2023
AI4Mat-2023 Poster
Submission Track: Findings
Submission Category: AI-Guided Design
Keywords: Neural Network Potential, Active Learning, Enhanced Sampling, Data Distillation, Knowledge Transfer
Supplementary Material: pdf
TL;DR: Efficient sampling of configurations and data distillation for neural network potentials
Abstract: Machine learning (ML) techniques and atomistic modeling have rapidly transformed materials design and discovery. In particular, generative models can swiftly propose promising materials for targeted applications. However, the properties predicted for materials proposed by generative models often do not match those calculated through ab initio calculations. This discrepancy can arise because the generated coordinates are not fully relaxed, whereas many properties are derived from relaxed structures. Neural network potentials (NNPs) can expedite the process by providing relaxed structures from the initially generated ones. Nevertheless, acquiring data to train NNPs for this purpose can be extremely challenging, as the data must encompass previously unknown structures. In this study, we used extended-ensemble molecular dynamics (MD) to sample a broad range of liquid- and solid-phase configurations of a representative metallic system, nickel. We then significantly reduced the dataset through active learning without losing much accuracy. We found that an NNP trained on the distilled data could predict different energy-minimized close-packed crystal structures even though those structures were not explicitly part of the initial data. Furthermore, the dataset can be transferred to other metallic systems (aluminum and niobium) without repeating the sampling and distillation processes. Our approach to data acquisition and distillation demonstrates the potential to expedite NNP development and enhance materials design and discovery by integrating generative models.
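The abstract describes distilling MD-sampled configurations through active learning. As a minimal illustration only, the sketch below implements one common selection criterion for such distillation, query-by-committee, in which the configurations an ensemble of models disagrees on most are retained. The paper's actual selection rule and interfaces are not specified here; the function names, the `keep_fraction` parameter, and the random stand-in predictions are all hypothetical.

```python
# Hypothetical sketch of committee-based data distillation (query-by-committee).
# Assumes an ensemble of NNPs whose per-configuration energy predictions are
# already collected into an array; this is NOT the paper's actual procedure.
import numpy as np

rng = np.random.default_rng(0)

def committee_disagreement(predictions: np.ndarray) -> np.ndarray:
    """Standard deviation of per-configuration predictions across the committee."""
    return predictions.std(axis=0)

def distill(configs: np.ndarray, predictions: np.ndarray, keep_fraction: float) -> np.ndarray:
    """Keep the configurations on which the committee disagrees most."""
    n_keep = max(1, int(keep_fraction * len(configs)))
    order = np.argsort(committee_disagreement(predictions))[::-1]
    return configs[order[:n_keep]]

# Toy usage: 5 committee members, 1000 MD-sampled configurations,
# each configuration summarized by a 3-component descriptor for illustration.
configs = rng.normal(size=(1000, 3))
predictions = rng.normal(size=(5, 1000))  # stand-in for NNP energy predictions
distilled = distill(configs, predictions, keep_fraction=0.1)
print(distilled.shape)  # (100, 3)
```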
Digital Discovery Special Issue: Yes
Submission Number: 5