BoChemian: Large Language Model Embeddings for Bayesian Optimization of Chemical Reactions

Published: 27 Oct 2023, Last Modified: 22 Dec 2023, RealML-2023
Keywords: Bayesian Optimization, Gaussian processes, Large Language Models, Chemical Reaction Optimization
Abstract: This paper explores the integration of Large Language Model (LLM) embeddings with Bayesian Optimization (BO) for chemical reaction optimization, using Buchwald-Hartwig reactions as a case study. By leveraging LLMs, we can transform textual chemical procedures into an informative feature space suitable for Bayesian optimization. Our findings show that even out-of-the-box open-source LLMs can map chemical reactions for optimization tasks, highlighting their latent specialized knowledge. The results motivate further model specialization through adaptive fine-tuning within the BO framework for on-the-fly optimization. This work serves as a foundational step toward a unified computational framework that synergizes textual chemical descriptions with machine-driven optimization, aiming for more efficient and accessible chemical research. The code is available at:
Submission Number: 67
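
The pipeline the abstract describes (text procedure, then embedding, then Gaussian process surrogate, then acquisition) can be illustrated with a minimal sketch. This is not the authors' implementation. The function `toy_embed` is a hypothetical stand-in for a real LLM embedding (a hashed bag-of-words vector), the RBF kernel and the expected-improvement acquisition are assumed choices that the abstract does not specify, and the procedure strings and yields in the usage section are made up purely for illustration.

```python
import hashlib
import math


def toy_embed(text, dim=16):
    """Hypothetical stand-in for an LLM embedding: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel (an assumed choice)."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq / length_scale ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_lower_transposed(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_posterior(X, y, x_star, noise=1e-3):
    """Posterior mean and standard deviation of a constant-mean GP at x_star."""
    n = len(X)
    K = [[rbf(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    y_mean = sum(y) / n
    alpha = solve_lower_transposed(L, solve_lower(L, [v - y_mean for v in y]))
    k_star = [rbf(x, x_star) for x in X]
    mean = y_mean + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_star, x_star) - sum(t * t for t in v), 1e-12)
    return mean, math.sqrt(var)


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


def suggest_next(observed, candidates):
    """Return the index of the candidate procedure with the highest EI."""
    X = [toy_embed(text) for text, _ in observed]
    y = [value for _, value in observed]
    best = max(y)
    scores = []
    for text in candidates:
        mean, std = gp_posterior(X, y, toy_embed(text))
        scores.append(expected_improvement(mean, std, best))
    return max(range(len(candidates)), key=scores.__getitem__)


if __name__ == "__main__":
    # Made-up procedure strings and yields, for illustration only.
    observed = [
        ("aryl bromide and amine with Pd catalyst and XPhos in toluene", 0.62),
        ("aryl bromide and amine with Pd catalyst and BrettPhos in dioxane", 0.35),
    ]
    candidates = [
        "aryl bromide and amine with Pd catalyst and XPhos in dioxane",
        "aryl chloride and amine with Pd catalyst and RuPhos in THF",
    ]
    print("next experiment:", candidates[suggest_next(observed, candidates)])
```

In practice the hashed embedding would be replaced by vectors from an actual LLM, and the hand-written Gaussian process by a maintained library. The sketch only shows how free-text procedures can serve directly as the search space for the optimizer.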