A Crystal Knowledge-Enhanced Pre-training Framework for Crystal Property Estimation

Published: 2024, Last Modified: 25 Jan 2025, Venue: ECML/PKDD (10) 2024, License: CC BY-SA 4.0
Abstract: The design of new crystalline materials, or simply crystals, with desired properties relies on the ability to estimate the properties of crystals based on their structure. To advance the ability of machine learning (ML) to enable property estimation, we address two key limitations. First, creating labeled data for training entails time-consuming laboratory experiments and physical simulations, yielding a shortage of such data. To reduce the need for labeled training data, we propose a pre-training framework that adopts a mutually exclusive mask strategy, enabling models to discern underlying patterns. Second, crystal structures obey physical principles. To exploit the principle of periodic invariance, we propose multi-graph attention (MGA) and crystal knowledge-enhanced (CKE) modules. The MGA module considers different types of multi-graph edges to capture complex structural patterns. The CKE module incorporates periodic attribute learning and atom-type contrastive learning by explicitly introducing crystal knowledge to enhance crystal representation learning. We integrate these modules in a CRystal knOwledge-enhanced Pre-training (CROP) framework. Experiments on eight different datasets show that CROP achieves promising estimation performance and can outperform strong baselines.
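The abstract does not spell out how the mutually exclusive mask strategy works. As a rough illustration only, one plausible reading is that two masked views of the same crystal hide disjoint sets of atoms, so each view still exposes what the other conceals. The sketch below follows that assumption; the helper names `mutually_exclusive_masks` and `apply_mask`, the `mask_ratio` parameter, and the use of `0` as a mask token are all hypothetical and not taken from the paper.

```python
import random


def mutually_exclusive_masks(num_atoms, mask_ratio=0.3, seed=0):
    """Return two disjoint sets of atom indices to mask.

    Hypothetical helper: an assumed reading of "mutually exclusive",
    not the authors' implementation.
    """
    if not 0 < mask_ratio <= 0.5:
        raise ValueError("mask_ratio must be in (0, 0.5] so two disjoint masks fit")
    rng = random.Random(seed)
    indices = list(range(num_atoms))
    rng.shuffle(indices)
    k = int(num_atoms * mask_ratio)
    # Consecutive slices of one shuffled list can never overlap.
    return set(indices[:k]), set(indices[k:2 * k])


def apply_mask(atom_types, masked, mask_token=0):
    """Replace the atomic number at each masked index with a mask token."""
    return [mask_token if i in masked else z for i, z in enumerate(atom_types)]


# Toy input: atomic numbers alternating Na (11) and Cl (17).
atom_types = [11, 17, 11, 17, 11, 17, 11, 17]
mask_a, mask_b = mutually_exclusive_masks(len(atom_types), mask_ratio=0.25)
view_a = apply_mask(atom_types, mask_a)
view_b = apply_mask(atom_types, mask_b)
```

Under this assumption, every atom remains visible in at least one of the two views, which is the property a pre-training objective could rely on when reconstructing masked atoms.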