Benchmark of Machine Learning Force Fields for Semiconductor Simulations: Datasets, Metrics, and Comparative Analysis
Keywords: Machine Learning Force Field (MLFF), Semiconductor datasets, MLFF Benchmark
TL;DR: We introduce two new semiconductor datasets (SAMD23) and thoroughly evaluate a range of MLFF models on energy and force errors as well as simulation-derived metrics.
Abstract: As semiconductor devices shrink and their structures become more complex, there is a growing need for large-scale atomic-level simulations as a less costly alternative to trial-and-error development. Although machine learning force fields (MLFFs) can meet the accuracy and scale requirements of such simulations, there are no open-access benchmarks for semiconductor materials. This study therefore presents a comprehensive benchmark suite consisting of two semiconductor material datasets and ten MLFF models with six evaluation metrics. We select two important semiconductor thin-film materials, silicon nitride and hafnium oxide, and generate their datasets using computationally expensive density functional theory simulations under various scenarios, at a cost of 2.6k GPU days. We also provide a variety of architectures as baselines: descriptor-based fully connected neural networks and graph neural networks with rotationally invariant or equivariant features. We assess not only the accuracy of energy and force predictions but also five additional simulation indicators to determine the practical applicability of MLFF models in molecular dynamics simulations. To facilitate further research, our benchmark suite is available at https://github.com/SAITPublic/MLFF-Framework.
Supplementary Material: zip
Submission Number: 316