Cross-Modal Predictive Architecture for Material Property Prediction

Published: 20 Sept 2025 · Last Modified: 29 Oct 2025 · AI4Mat-NeurIPS-2025 Poster · CC BY 4.0
Keywords: Multimodal Learning, Materials Informatics, Self-Supervised Learning, AI-Guided Design
TL;DR: X-MoPA learns material properties by predicting one modality (graph, text, or XRD) from the other two in a shared latent space, enabling efficient cross-modal representation learning and state-of-the-art performance.
Abstract: In this work, we propose the Cross-Modal Predictive Architecture (X-MoPA), a multimodal learning model that combines crystal structure graphs, X-ray diffraction (XRD) patterns, and text-based structural descriptions to improve materials property prediction. Unlike prior multimodal approaches that rely on heavy attention mechanisms or simple concatenation, X-MoPA leverages lightweight predictors to learn a joint latent space through cross-modal prediction: for each training instance, we select two modalities and predict the third in latent space. This formulation captures complementary information across modalities while avoiding reconstruction inefficiencies and contrastive memory bottlenecks. We train and evaluate the model on Matbench for several key properties: band gap, shear modulus, bulk modulus, and formation energy of perovskites. X-MoPA consistently outperforms state-of-the-art (SOTA) models, with error reductions ranging from 16% to 60% across four key properties, while matching the best baseline on shear modulus. Beyond Matbench, X-MoPA achieves SOTA performance on AFLOW band gap prediction, showing that the learned cross-modal representations transfer well across datasets with different sampling strategies and property distributions.
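The core training step described in the abstract — encode each modality into a shared latent space, then use a lightweight predictor to infer the held-out modality's latent from the other two — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear encoders, latent dimension, raw feature sizes, and the single linear predictor per target modality are all placeholder assumptions standing in for the actual graph, XRD, and text encoders.

```python
import random

random.seed(0)

D = 4  # shared latent dimension (illustrative value, not from the paper)
RAW = {"graph": 6, "xrd": 10, "text": 8}  # hypothetical raw feature sizes

def rand_matrix(rows, cols, scale=0.5):
    return [[random.gauss(0, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(M, v):
    # Matrix-vector product for a rows x cols matrix M and length-cols vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in M]

# Per-modality encoders into the shared latent space (linear stand-ins
# for the paper's graph / XRD / text encoders).
encoders = {m: rand_matrix(D, n) for m, n in RAW.items()}

# One lightweight predictor per target modality: maps the two context
# latents (concatenated) to the held-out modality's latent.
predictors = {m: rand_matrix(D, 2 * D) for m in RAW}

def cross_modal_loss(latents, target):
    """MSE in latent space between the predicted and actual target latent."""
    context = [x for m in sorted(latents) if m != target for x in latents[m]]
    pred = matvec(predictors[target], context)
    return sum((p - t) ** 2 for p, t in zip(pred, latents[target])) / D

# One training instance: random raw features per modality.
raw = {m: [random.gauss(0, 1) for _ in range(n)] for m, n in RAW.items()}
latents = {m: matvec(encoders[m], raw[m]) for m in RAW}

# Rotate the held-out modality over all three choices; each loss would be
# backpropagated through the predictor and encoders during training.
losses = {t: cross_modal_loss(latents, t) for t in RAW}
```

Because the prediction target is a latent vector rather than raw XRD intensities or text tokens, no decoder is needed, which is what lets the predictors stay lightweight compared with reconstruction- or attention-heavy multimodal models.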
Submission Track: Paper Track (Short Paper)
Submission Category: AI-Guided Design
Institution Location: Montreal, Quebec, Canada
AI4Mat Journal Track: Yes
AI4Mat RLSF: Yes
Submission Number: 115