MTENCODER: A Multi-task Pretrained Transformer Encoder for Materials Representation Learning

Thorben Prein; Elton Pan; Tom Doerr; Elsa Olivetti; Jennifer L.M. Rupp

Published: 27 Oct 2023, Last Modified: 11 Dec 2023

Venue: AI4Mat-2023 Poster

Submission Track: Papers

Submission Category: AI-Guided Design

Keywords: Representation learning, Inorganic materials, Materials informatics

Supplementary Material: pdf

TL;DR: We present a multi-task learning framework for inorganic materials that outperforms structure-agnostic encoders and aids in materials discovery.

Abstract: Each inorganic compound is characterized by a wide range of material properties, which makes learning representations for inorganic materials difficult. The prevailing trend in the materials informatics community is to design specialized models that each predict a single property. We introduce a multi-task learning framework in which a transformer-based encoder is co-trained across diverse materials properties and a denoising objective, yielding robust and generalizable materials representations. Our method improves on the performance of single-dataset pretraining and also scales and adapts to multi-dataset pretraining. Experiments demonstrate that the trained encoder, MTEncoder, captures chemically meaningful representations and surpasses the performance of current structure-agnostic materials encoders. This approach paves the way for improvements in many materials informatics tasks, notably materials property prediction and synthesis planning for materials discovery.
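To make the co-training idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy linear map in place of the transformer encoder, a linear decoder for the denoising objective, one linear head per property, and squared-error losses; all names (`encode`, `multitask_loss`, `band_gap`, `formation_energy`) and the Gaussian input noise are hypothetical choices made for illustration. It shows one way a shared encoder can receive a combined loss from a denoising term and from whichever property labels a given compound happens to have.

```python
import random


def encode(vector, weights):
    """Toy linear map standing in for the shared encoder (an assumption, not a transformer)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(fractions, labels, enc_w, dec_w, heads,
                   noise=0.1, denoise_weight=1.0, rng=None):
    """Combine a denoising loss with per-property losses on a shared embedding.

    fractions: element fractions of one composition (hypothetical input encoding)
    labels:    dict task -> target value, or None when the label is missing
    heads:     dict task -> weight vector of a linear prediction head
    """
    rng = rng or random.Random(0)

    # Denoising objective: corrupt the input, encode it, reconstruct the clean input.
    noisy = [x + rng.gauss(0.0, noise) for x in fractions]
    recon = encode(encode(noisy, enc_w), dec_w)
    losses = {"denoise": mse(recon, fractions)}

    # Property objectives: every head reads the same shared embedding.
    z = encode(fractions, enc_w)
    for task, target in labels.items():
        if target is None:
            continue  # datasets rarely label every property for every compound
        pred = sum(h * zi for h, zi in zip(heads[task], z))
        losses[task] = (pred - target) ** 2

    property_loss = sum(v for k, v in losses.items() if k != "denoise")
    total = denoise_weight * losses["denoise"] + property_loss
    return total, losses


if __name__ == "__main__":
    enc_w = [[0.3, -0.1, 0.2], [0.0, 0.5, -0.4]]    # 3 inputs -> 2-dim embedding
    dec_w = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]    # 2-dim embedding -> 3 outputs
    heads = {"band_gap": [1.0, -1.0], "formation_energy": [0.2, 0.7]}
    labels = {"band_gap": 1.2, "formation_energy": None}
    total, parts = multitask_loss([0.5, 0.25, 0.25], labels, enc_w, dec_w, heads)
    print(total, parts)
```

Skipping missing labels is what lets several datasets with different property columns share one encoder in this sketch; how the actual model weights or schedules its tasks is not stated in the abstract.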

Submission Number: 90
