Towards Generating Stable Materials via Large Language Models with Reinforcement Learning Finetuning

Published: 24 Sept 2025 · Last Modified: 26 Dec 2025 · NeurIPS 2025 AI4Science Poster · CC BY 4.0
Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: Materials Generation, Reinforcement Learning, Large Language Models
TL;DR: Reinforcement learning finetuning of large language models for generating stable materials
Abstract: Discovering novel materials is essential for advancing technology, yet generating thermodynamically stable crystal structures remains a significant challenge because generative models are difficult to steer directly toward physically realistic structures. We investigate reinforcement learning (RL) finetuning of large language models (LLMs) for crystal structure generation using energy-based rewards. Our results show that RL finetuning improves the rate of generating metastable crystals compared to supervised finetuning (SFT) and performs comparably to established diffusion-based baselines. Notably, the RL-steered model produces structures significantly closer to their relaxed states, which potentially reduces the computational overhead of downstream structural optimization. Future work may build on these results by investigating reward formulations better aligned with thermodynamic stability and by exploring methods for maintaining structural diversity during optimization.
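As an illustration only (the abstract does not specify the paper's exact reward formulation), the sketch below shows one plausible way to map a structure's predicted energy above the convex hull to a bounded scalar reward for RL finetuning. The function name, the 0.1 eV/atom metastability threshold, and the linear decay are assumptions, not the authors' method.

```python
# Hypothetical energy-based reward for RL finetuning of a crystal-generating LLM.
# Assumption: stability is summarized by the predicted energy above the convex hull
# (e_above_hull, in eV/atom), where lower values indicate more stable structures.

def energy_reward(e_above_hull: float, metastable_threshold: float = 0.1) -> float:
    """Map energy above hull to a reward in [0, 1].

    Structures on or below the hull (e_above_hull <= 0) receive the maximum reward;
    the reward decays linearly to 0 at the metastability threshold and stays 0 beyond it.
    """
    if e_above_hull <= 0.0:
        return 1.0
    if e_above_hull >= metastable_threshold:
        return 0.0
    return 1.0 - e_above_hull / metastable_threshold


# Example: a generated structure predicted to lie 0.05 eV/atom above the hull
print(energy_reward(0.05))  # 0.5
```

A bounded mapping of this kind is often used in practice because raw energies have an unbounded range, which can destabilize policy-gradient updates; whether the paper uses such a transform or the energy directly is not stated in the abstract.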
Submission Number: 496