Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: Large Language Models, Materials Generation
TL;DR: Large Language Model Reasoning with Reinforcement Learning Finetuning Boosts Generated Materials Stability
Abstract: Designing stable crystal structures is central to accelerating the discovery of new materials, yet most generative approaches remain limited to reproducing known patterns rather than exploring novel possibilities. We present a method that trains large language models with reinforcement learning guided by verifiable, energy-based rewards, optimizing toward physically grounded stability objectives. Compared to supervised finetuning and base models, our reinforcement learning–trained model generates crystals with higher predicted stability and a greater proportion of previously unreported structures. These results suggest that combining reinforcement learning with verifiable, energy-based rewards provides a powerful path toward automated discovery of novel, stable materials.
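To make the reward design concrete, here is a minimal, hypothetical sketch of what a verifiable, energy-based stability reward might look like. The abstract does not specify the reward function; the function name, the use of energy above the convex hull (a standard stability proxy in eV/atom), and the linear decay and threshold below are all illustrative assumptions, not details from the paper.

```python
def stability_reward(energy_above_hull_ev: float, threshold: float = 0.1) -> float:
    """Map a predicted energy above the convex hull (eV/atom) to a reward in [0, 1].

    Hypothetical example: structures on or below the hull (<= 0 eV/atom) receive
    the maximum reward; the reward decays linearly to zero at `threshold` eV/atom.
    The threshold value is an illustrative choice, not taken from the paper.
    """
    if energy_above_hull_ev <= 0.0:
        return 1.0  # predicted stable: full reward
    if energy_above_hull_ev >= threshold:
        return 0.0  # far from the hull: no reward
    # metastable region: linear interpolation between the two extremes
    return 1.0 - energy_above_hull_ev / threshold
```

In an RL finetuning loop, a generated crystal would be relaxed and scored by an energy model, and the resulting scalar would serve as the (verifiable) reward signal for the policy update.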
Submission Number: 496