Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: Foundation Models, Material Discovery, Batteries, Devices, Prediction, Benchmarking, Empirical Datasets, Scarce Data, Scientific Discovery, Electrolytes
Abstract: Recent years have seen the rapid emergence and adoption of chemical foundation models in computational materials science for property prediction and generation tasks, focused mostly on small molecules or crystals. Despite this paradigm shift, integrating newly discovered materials into real-world devices remains a challenge because of design problems: a new candidate material must be optimized for compatibility with the other components of the device to attain the target performance. Benchmarks for chemical foundation models must therefore evaluate how well the models predict macro-scale outcomes that result from chemical interactions in a multivariate design space. This study evaluates the performance of a chemical foundation model, pre-trained on 91 million SMILES strings of small molecules, in extrapolating what it has learned from molecules to material design challenges across multiple length scales in batteries. The base model is fine-tuned using ten datasets covering molecular structures, formulations, and battery device measurements, and its performance is benchmarked against conventional molecular representations such as Morgan fingerprints. The study further examines the model’s capacity to generalize to out-of-distribution (OOD) cases by quantifying prediction errors for novel material designs that differ substantially from the training data. Finally, the interpretability of the resulting models is assessed, with the aim of enabling researchers to apply them selectively for design interpretation within regions of chemical space where prediction confidence can be reasonably established.
Submission Number: 240