Investigating extrapolation and low-data challenges via contrastive learning of chemical compositions

Published: 27 Oct 2023, Last Modified: 11 Dec 2023
Venue: AI4Mat-2023 Spotlight
Submission Track: Papers
Submission Category: AI-Guided Design
Keywords: contrastive learning, chemical compositions, graph neural networks, extrapolation, property prediction
Supplementary Material: pdf
Abstract: Practical applications of machine learning for materials discovery remain severely limited by the quantity and quality of the available data. Furthermore, little is known about the ability of machine learning models to extrapolate beyond the training distribution, which is essential for discovering compounds with extraordinary properties. To address these challenges, we develop a novel deep representation learning framework for chemical compositions. The proposed model, named COmpositional eMBedding NETwork (CombNet), combines recent developments in graph-based encoding of chemical compositions with a supervised contrastive learning approach. This is motivated by the observation that contrastive learning can produce a regularized representation space from raw data, offering empirical benefits for extrapolation in low-data scenarios. Moreover, our method relies exclusively on the chemical composition of the underlying materials, as the crystal structure is generally unavailable before a material is discovered. We demonstrate the effectiveness of CombNet over state-of-the-art methods under a bespoke evaluation scheme that simulates a realistic materials discovery scenario with experimental data.
Submission Number: 19
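
Illustrative sketch: The abstract does not spell out the loss used by CombNet, so the following is only a minimal sketch of the widely used supervised contrastive (SupCon) loss, not the authors' implementation. It assumes that each composition has already been mapped to an embedding vector by some encoder, and that property values have been binned into discrete classes so that "positives" are samples sharing a bin; both the function name `supcon_loss` and the binning are assumptions made for illustration.

```python
import math


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch).

    embeddings: list of non-zero vectors (hypothetical encoder outputs).
    labels: one class label per embedding (assumed: binned property values).
    """
    # L2-normalise so the dot product is a cosine similarity.
    unit = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        unit.append([x / norm for x in vec])

    n = len(unit)
    total, anchors = 0.0, 0
    for i in range(n):
        # Temperature-scaled similarity of anchor i to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(unit[i], unit[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # anchors without a positive contribute nothing

        # Numerically stable log-sum-exp over all non-anchor samples.
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))

        # Average negative log-probability of the positives for this anchor.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1

    return total / anchors if anchors else 0.0


# Toy vectors standing in for composition embeddings (not real data).
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(toy, [0, 0, 1, 1])     # same-label samples are close
misaligned = supcon_loss(toy, [0, 1, 0, 1])  # same-label samples are far apart
```

In this toy batch the loss is lower when samples sharing a label lie close together in embedding space, which is the property a contrastive objective rewards when shaping the representation space.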