Investigating extrapolation and low-data challenges via contrastive learning of chemical compositions
Submission Track: Papers
Submission Category: AI-Guided Design
Keywords: contrastive learning, chemical compositions, graph neural networks, extrapolation, property prediction
Supplementary Material: pdf
Abstract: Practical applications of machine learning for materials discovery remain severely limited by the quantity and quality of the available data. Furthermore, little is known about the ability of machine learning models to extrapolate outside of the training distribution, which is essential for the discovery of compounds with extraordinary properties. To address these challenges, we develop a novel deep representation learning framework for chemical compositions. The proposed model, named COmpositional eMBedding NETwork (CombNet), combines recent developments in graph-based encoding of chemical compositions with a supervised contrastive learning approach. This is motivated by the observation that contrastive learning can produce a regularized representation space from raw data, offering empirical benefits for extrapolation in low-data scenarios. Moreover, our method harnesses exclusively the chemical composition of the underlying materials, as crystal structure is generally unavailable before the material is discovered. We demonstrate the effectiveness of CombNet over state-of-the-art methods under a bespoke evaluation scheme that simulates a realistic materials discovery scenario with experimental data.
Submission Number: 19