\begin{abstract}
We demonstrate that large language models (LLMs) can effectively discover high-entropy alloy (HEA) catalysts when augmented with retrieval-based grounding from materials databases. Our framework combines GPT-4 with a materials database of more than 50,000 entries to generate and validate novel catalyst compositions. The approach identified more than 250 candidates, 82\% of which are thermodynamically stable, including Fe$_{0.2}$Co$_{0.2}$Ni$_{0.2}$Ir$_{0.1}$Ru$_{0.3}$, which achieves an overpotential of 0.285\,V, 25\% lower than that of IrO$_2$. Experimental validation of 10 candidates agrees with density functional theory (DFT) predictions to within 20\%; synthesis via arc melting at 1650--1800\,$^\circ$C yields single-phase materials showing $<$5\% degradation over 1000 cycles. Compared with graph neural networks and active learning approaches, our method is 200$\times$ more computationally efficient while maintaining comparable discovery rates. The framework extends to the hydrogen evolution reaction (HER) and the CO$_2$ reduction reaction (CO$_2$RR), and it operates with open-source LLMs (LLaMA-2, Mistral) at 70\% of GPT-4's performance. We identify three key success factors: implicit chemical knowledge in pre-trained models, retrieval-augmented generation (RAG) that prevents hallucinations, and iterative refinement that incorporates DFT feedback. This work establishes LLM-based materials discovery as a practical alternative to traditional high-throughput screening.
\end{abstract}