Readers: Everyone (since 09 Apr 2025) · License: CC BY 4.0
Discovering novel materials is critical for advancing technologies such as solar cells, batteries, and carbon capture. However, the development of new materials is constrained by a slow and expensive trial-and-error process. To accelerate this pipeline, we introduce PLaID, a Large Language Model (LLM) fine-tuned for stable crystal generation. We first fine-tune a base version of LLaMA-2 7B on Wyckoff-based text representations of crystals. We then fine-tune it further with Direct Preference Optimization (DPO) on sampled structures categorized by their stability. By encoding symmetry constraints directly into text and aligning model outputs to explore stable chemical space, PLaID generates structures that are thermodynamically stable, unique, and novel at a 40% higher rate than prior methods. Our work demonstrates the potential of adapting post-training techniques from natural language processing to materials design, paving the way for targeted and efficient discovery of novel materials.
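To make the preference-optimization step concrete, the following is a minimal sketch of the standard Direct Preference Optimization loss for a single preference pair, together with one possible way of turning stability-labeled samples into (chosen, rejected) pairs. It is not taken from the PLaID implementation: the helper names `dpo_loss` and `build_preference_pairs`, the use of energy above the convex hull as the stability label, the threshold, and the value of `beta` are all assumptions made for illustration.

```python
import math
from itertools import product


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the model being trained
    (policy) and under the frozen reference model. beta is an assumed value.
    """
    # How much more the policy prefers the chosen sample than the reference does.
    margin = ((policy_logp_chosen - ref_logp_chosen)
              - (policy_logp_rejected - ref_logp_rejected))
    z = beta * margin
    # -log(sigmoid(z)), computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


def build_preference_pairs(samples, stable_threshold=0.0):
    """Pair every stable sample (chosen) with every unstable one (rejected).

    `samples` is a list of (crystal_text, e_hull) tuples. Labeling by energy
    above hull and the threshold are hypothetical; the source only says that
    sampled structures are categorized by their stability.
    """
    stable = [text for text, e_hull in samples if e_hull <= stable_threshold]
    unstable = [text for text, e_hull in samples if e_hull > stable_threshold]
    return list(product(stable, unstable))
```

In this sketch the loss decreases as the policy assigns relatively more probability to the stable structure than the reference model does, which is the sense in which the model is aligned toward stable outputs.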