Advancing Agricultural Decision-Making with A Multi-Dimensional Evaluation of Large Language Models for Sustainable Pest Management

Shanglong Yang, Zhipeng Yuan, Shunbao Li, Ruoling Peng, Kang Liu, Po Yang

Published: 2024, Last Modified: 23 Jun 2025, INDIN 2024, CC BY-SA 4.0

Abstract: In the rapidly evolving field of artificial intelligence, large language models (LLMs) have attracted considerable attention from researchers in many fields because of their unexpectedly strong text generation and comprehension capabilities. However, applications of LLMs to sustainable pest management remain under-explored, because the task relies heavily on specialized expert knowledge. In addition, evaluating the quality of LLM-generated content is a further technical challenge for applying LLMs in this domain. We therefore propose an instruction-based prompting method that integrates pest expert knowledge into the prompt, giving LLMs the context they need to generate more accurate and relevant pest management advice. We also propose an LLM-based evaluation framework that scores the generated content on Coherence, Logical Consistency, Fluency, Relevance, Comprehension, and Exhaustion. Additionally, we integrate an Expert System based on crop threshold data as a baseline for scoring Accuracy, that is, whether the model correctly judges if pests found in crop fields warrant management action. For each model, the per-dimension scores are combined into a final score using percentage weights. The results show that GPT-3.5 and GPT-4 outperform the FLAN models in most evaluation dimensions. Furthermore, instruction-based prompting that contains domain-specific knowledge outperforms the other prompting methods, reaching an accuracy of 72%. Nevertheless, ongoing refinement and assessment of end-user satisfaction remain essential to improve the LLMs' effectiveness and practical helpfulness in providing pest management advice.
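The scoring pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold values, the weights, and the names `ACTION_THRESHOLDS`, `WEIGHTS`, `needs_action`, `accuracy`, and `final_score` are all hypothetical, chosen only to show how a threshold-based baseline and a percentage-weighted final score might fit together.

```python
# Hypothetical action thresholds (pests per plant). These numbers are
# illustrative assumptions, not values from the paper.
ACTION_THRESHOLDS = {"aphid": 5.0, "slug": 4.0}

# Assumed percentage weights per evaluation dimension (they sum to 1.0).
# The abstract does not state the actual weights.
WEIGHTS = {
    "coherence": 0.1,
    "logical_consistency": 0.1,
    "fluency": 0.1,
    "relevance": 0.1,
    "comprehension": 0.1,
    "exhaustion": 0.1,
    "accuracy": 0.4,
}


def needs_action(pest: str, density: float) -> bool:
    """Baseline decision: act when density reaches the pest's threshold."""
    return density >= ACTION_THRESHOLDS[pest]


def accuracy(llm_decisions: list[bool], cases: list[tuple[str, float]]) -> float:
    """Fraction of cases where the LLM decision matches the baseline."""
    matches = sum(
        decision == needs_action(pest, density)
        for decision, (pest, density) in zip(llm_decisions, cases)
    )
    return matches / len(cases)


def final_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of per-dimension scores, each assumed to lie in [0, 1]."""
    return sum(weights[dim] * scores[dim] for dim in weights)


# Toy usage: four field observations and the action decisions an LLM gave.
cases = [("aphid", 6.0), ("aphid", 2.0), ("slug", 4.0), ("slug", 1.0)]
llm_decisions = [True, True, True, False]

scores = {dim: 0.5 for dim in WEIGHTS}
scores["accuracy"] = accuracy(llm_decisions, cases)
total = final_score(scores, WEIGHTS)
```

In this toy setup the LLM disagrees with the baseline on one of four cases, so the Accuracy term is lower than it would be under full agreement, and the weighted sum carries that difference into the final score.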