Large Language Models are Zero Shot Hypothesis Proposers

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission · Readers: Everyone
Abstract: Scientific discovery drives human civilization, but the vast volume of existing knowledge hinders its progress. Large language models (LLMs) offer a promising opportunity to reshape how we interact with knowledge, yet their potential in knowledge discovery remains unexplored. In this paper, we investigate whether LLMs can propose new scientific hypotheses. First, we construct a dataset consisting of background-knowledge and hypothesis pairs from biomedical literature, divided into training, seen, and unseen test sets by publication date to avoid data contamination. We then evaluate the hypothesis generation capabilities of various top-tier instruction-tuned models in zero-shot, few-shot, and fine-tuning settings. Furthermore, drawing inspiration from uncertainty exploration in real-world scenarios, we incorporate tool use and multi-agent interactions to augment uncertainty. We also design four metrics, informed by a comprehensive literature review, to evaluate the generated hypotheses in both LLM-based and human evaluations. Through experiments and analysis, we arrive at the following findings: 1) LLMs surprisingly generate novel, validated hypotheses from test-set literature they were not trained on. 2) Increasing uncertainty facilitates candidate generation, potentially enhancing zero-shot hypothesis generation capabilities. These findings strongly support the potential of LLMs as catalysts for new scientific discoveries and guide further exploration.
Paper Type: long
Research Area: Resources and Evaluation
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English