\section{Introduction}
\label{sec:introduction}

Atmospheric CO$_2$ concentrations above 420\,ppm create an urgent need for better catalysts for electrochemical energy conversion \cite{friedlingstein2024global}. The oxygen evolution reaction (OER) is the bottleneck in water splitting because of its sluggish four-electron kinetics. While IrO$_2$ and RuO$_2$ achieve overpotentials of 320--370\,mV, their scarcity motivates the exploration of high-entropy alloys (HEAs), which leverage multi-element synergies \cite{he2023threedfourfivehea,ding2020highentropy}.

Traditional materials discovery takes 10--20 years from concept to deployment, and the search is bottlenecked by the $10^{60}$ possible five-component HEA combinations. Computational screening demands specialized expertise in density functional theory (DFT) and electrochemistry, and in practice it explores only a small fraction of chemical space. Synthesis feasibility, operational stability, and scalability add further dimensions to the optimization problem, so progress is often limited to incremental improvements.

Large language models (LLMs) present unexpected opportunities for materials discovery despite lacking explicit chemistry training. GPT-4 encodes implicit scientific knowledge from vast training corpora \cite{microsoft2023impact,bran2024chemcrow}, yet it generates chemically implausible compositions when it is not properly grounded. This raises a question: can general-purpose text-generation models contribute to specialized catalyst discovery?

Retrieval-augmented generation (RAG) bridges LLM capabilities and materials science, enabling HEA catalyst discovery without fine-tuning. RAG grounds model outputs in a database of more than 50,000 validated materials while preserving creative exploration \cite{lewis2020retrieval}. Unlike traditional machine learning, which requires labeled datasets, this approach leverages the knowledge already present in the LLM and augments it with real-time access to materials data. Structured prompts encode the Pauling and Hume-Rothery rules as natural-language constraints.
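
As a hedged illustration of what such a natural-language constraint corresponds to, the classical Hume-Rothery size criterion for substitutional solid solutions can be written as an inequality. The symbols $r_{\mathrm{solute}}$ and $r_{\mathrm{solvent}}$ (atomic radii) are introduced here for illustration only, and the exact constraints and thresholds encoded in the prompts may differ:
\begin{equation}
\frac{\left| r_{\mathrm{solute}} - r_{\mathrm{solvent}} \right|}{r_{\mathrm{solvent}}} \leq 0.15 .
\end{equation}
A prompt could state the same condition in plain language, for instance by asking that the atomic radii of the proposed elements differ by no more than about 15\%.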

This paper makes the following key contributions to the field of AI-driven materials discovery:

\begin{enumerate}
  \item We present the first demonstration of LLM-driven catalyst discovery without fine-tuning, generating over 250 novel HEA compositions with an 82\% thermodynamic stability rate, validated through density functional theory (DFT) calculations.

  \item We introduce an integration of retrieval-augmented generation with computational screening that enables LLMs to navigate the vast HEA compositional space efficiently, achieving a $200\times$ reduction in computational resources compared to traditional high-throughput screening.

  \item We show through DFT calculations that LLM-generated catalysts achieve a 15--20\% improvement in limiting potentials over commercial IrO$_2$ baselines, with the best composition, Fe$_{0.2}$Co$_{0.2}$Ni$_{0.2}$Ir$_{0.1}$Ru$_{0.3}$, reaching an overpotential of 0.285\,V.

  \item We demonstrate that the system discovers synergistic elemental combinations while maintaining the 82\% stability rate of its generated candidates. One example is Fe--Co pairs that enhance *OH binding beyond linear mixing predictions, which indicates that the LLM can capture complex chemical relationships.
\end{enumerate}
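
To make the ``high-entropy'' label concrete for the best composition listed above, the following sketch evaluates the ideal configurational entropy of mixing. It assumes a random solid solution and uses the widely cited $1.5R$ threshold for high-entropy alloys. Neither the formula nor the threshold is taken from the calculations reported in this paper, and $c_i$ (mole fraction of element $i$) and $R$ (the gas constant) are symbols introduced here:
\begin{equation}
\Delta S_{\mathrm{mix}} = -R \sum_{i} c_i \ln c_i .
\end{equation}
For Fe$_{0.2}$Co$_{0.2}$Ni$_{0.2}$Ir$_{0.1}$Ru$_{0.3}$ this gives
\[
\Delta S_{\mathrm{mix}} = -R \left( 3 \times 0.2 \ln 0.2 + 0.1 \ln 0.1 + 0.3 \ln 0.3 \right) \approx 1.56\,R \approx 12.9\ \mathrm{J\,mol^{-1}\,K^{-1}},
\]
which lies above $1.5R$ and slightly below the equimolar five-component maximum of $R \ln 5 \approx 1.61\,R$.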

Together, these contributions establish a new paradigm for accelerated materials discovery that broadens access to advanced catalyst design. Because the approach requires neither specialized AI training nor deep domain expertise, it allows researchers across disciplines to contribute materials-based solutions to the climate crisis.

