Integrating Chemistry Knowledge in Large Language Models via Prompt Engineering

Published: 17 Jun 2024, Last Modified: 17 Jul 2024. ICML 2024 AI4Science Poster. License: CC BY 4.0
Keywords: AI for Science; Large Language Model; Prompt Engineering; Domain-Knowledge Chain-of-Thought.
TL;DR: Our work presents a study on the integration of domain-specific knowledge into prompt engineering to enhance the performance of large language models (LLMs) in three domains spanning chemistry, biology, and materials science: organic small molecules, enzymes, and crystal materials.
Abstract:
Introduction and Background
The rapid advance of artificial intelligence (AI) has significantly propelled its integration into the natural sciences, particularly biology, chemistry, and materials science. Traditional AI applications in science have focused on property prediction but remain limited to known molecules or materials. The advent of LLMs, capable of zero-shot reasoning and of incorporating domain knowledge, presents a new avenue for scientific discovery. Yet the quality of prompts strongly influences LLM outputs, and a key limitation of existing prompt engineering methods is that they do not incorporate domain expertise as guidance for problem-solving, considerably restricting the capabilities of LLMs on many domain-specific tasks. This study addresses the gap in LLM applications in chemistry, biology, and materials science by introducing a domain-specific prompt engineering framework that enhances LLM applicability in these fields.
Methodology
For task construction, we collect and curate a comprehensive dataset of 1280 questions and corresponding solutions to evaluate LLM capability. The tasks span three domains of significant relevance in academic research and practical applications: organic small molecules, enzymes, and crystal materials. In conjunction with this dataset, we develop an open-source, LLM-pluggable automatic benchmarking scheme that evaluates domain-specific performance with several metrics, including capability, accuracy, F1 score, and hallucination drop. Formulating domain-specific scientific prediction as LLM question-answering tasks, we propose a domain-knowledge-embedded prompt engineering strategy that draws both on heuristics from generic prompt engineering methods such as few-shot prompting, chain-of-thought prompting, and expert prompting, and on domain knowledge incorporation, which integrates the thought processes of chemistry and biology experts to provide precise background knowledge and to exemplify accurate human reasoning for the LLM. The prompting scheme takes the form of a multi-expert mixture: each expert role-plays a specialist and is given a few chain-of-thought demonstrations integrated with domain knowledge or instructions, and the experts' answers are then aggregated by majority vote.
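To make the multi-expert scheme concrete, the sketch below outlines one plausible implementation in Python. The expert personas, demonstration texts, and the `complete` and `extract_answer` callables are our own illustrative assumptions, not the authors' released code; only the overall structure (role-played experts, domain-knowledge CoT demonstrations, majority-vote aggregation) follows the description above.

```python
# Minimal sketch of the multi-expert, domain-knowledge-embedded prompting scheme.
# Personas, demonstrations, and the LLM wrapper below are illustrative placeholders.
from collections import Counter
from typing import Callable, List

def build_expert_prompt(persona: str, demos: List[str], question: str) -> str:
    """Compose a role-playing prompt with domain-knowledge CoT demonstrations."""
    demo_block = "\n\n".join(demos)
    return (
        f"You are {persona}.\n"
        "Reason step by step, using the domain knowledge illustrated below.\n\n"
        f"{demo_block}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

def multi_expert_answer(
    question: str,
    experts: List[dict],                   # each: {"persona": ..., "demos": [...]}
    complete: Callable[[str], str],        # wraps any LLM completion endpoint
    extract_answer: Callable[[str], str],  # parses the final answer from the CoT output
) -> str:
    """Query each expert independently and aggregate answers by majority vote."""
    votes = []
    for expert in experts:
        prompt = build_expert_prompt(expert["persona"], expert["demos"], question)
        votes.append(extract_answer(complete(prompt)))
    # Majority vote: the most common answer among experts is returned.
    return Counter(votes).most_common(1)[0][0]
```

In use, each expert would be instantiated with a persona such as an organic chemist, an enzymologist, or a crystallographer, together with a handful of domain-annotated CoT demonstrations; for numeric or free-form answers, the simple majority vote would be replaced by an appropriate consensus rule.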
Results
The domain-knowledge-embedded prompt engineering method outperforms other techniques on almost all tasks involving small molecules and crystal materials, and on over half of the enzyme tasks. Across task types and complexities, the domain-knowledge approach consistently outperforms general methods, especially on tasks requiring complex reasoning or intricate knowledge of experimental data, highlighting the model's ability to navigate and synthesize domain-specific information into more accurate and relevant responses. Moreover, examining performance across different materials shows that the tailored prompts significantly enhance the LLM's ability to process and analyze data from distinct scientific domains.
Case Studies
Three pivotal case studies are chosen for their significant implications in scientific research and industrial applications: MacMillan's catalyst, paclitaxel, and lithium cobalt oxide. These materials represent cornerstone discoveries in their respective fields, each posing unique challenges and opportunities for exploration through AI-assisted methodologies. The MacMillan catalyst, a Nobel-recognized innovation in organocatalysis, is studied for its potential to revolutionize synthetic chemistry through enhanced catalytic processes; the task involves dissecting the catalyst's complex structure and predicting its reactivity and selectivity, a testament to the model's capability to navigate intricate chemical landscapes. Paclitaxel, a key agent in cancer therapy, is explored for optimization of its synthesis pathway, highlighting the model's ability to contribute to pharmaceutical advances by streamlining synthetic routes for complex molecules. Lastly, lithium cobalt oxide, essential in lithium-ion battery technology, is examined for its crystallographic properties and electrochemical behavior.
Conclusion
The integration of domain-specific knowledge into LLMs through prompt engineering yields significant performance improvements across a range of scientific tasks. This approach not only makes LLMs more applicable in specialized areas but also highlights their potential as powerful tools for scientific discovery and innovation. The study also outlines future directions, including expanding domain coverage, integrating datasets and tools, and developing multi-modal prompting techniques.
Submission Number: 124