Can LLMs Reason Like Scientists? A Survey on Hypothesis Generation

ACL ARR 2025 July Submission1359 Authors

29 Jul 2025 (modified: 03 Sept 2025), ACL ARR 2025 July Submission, CC BY 4.0

Abstract: Can machines reason like scientists? Scientific hypothesis generation—the process of formulating testable explanations for observed phenomena—remains the most critical bottleneck in accelerating scientific discovery. While recent advances in Large Language Models (LLMs) show promise for automating hypothesis generation, the field lacks a systematic understanding of their capabilities, limitations, and optimal application strategies. In this survey, we explore the emerging landscape of LLM-driven hypothesis generation. We present a structured taxonomy of current approaches, analyse domain-specific datasets and evaluation strategies, and discuss open challenges. We review 37 papers spanning diverse scientific domains from 2023 to 2025. Overall, our goal is to clarify the state of the art, motivate further interdisciplinary research, and provide practical guidance through a continuously updated GitHub repository of relevant papers and resources.

Paper Type: Long

Research Area: NLP Applications

Research Area Keywords: Scientific hypothesis generation, Large Language Models, Automated discovery, Scientific reasoning

Contribution Types: Surveys

Languages Studied: English

Reassignment Request Area Chair: This is not a resubmission

Reassignment Request Reviewers: This is not a resubmission

A1 Limitations Section: This paper has a limitations section.

A2 Potential Risks: N/A

B Use Or Create Scientific Artifacts: No

B1 Cite Creators Of Artifacts: N/A

B2 Discuss The License For Artifacts: N/A

B3 Artifact Use Consistent With Intended Use: N/A

B4 Data Contains Personally Identifying Info Or Offensive Content: N/A

B5 Documentation Of Artifacts: N/A

B6 Statistics For Data: N/A

C Computational Experiments: No

C1 Model Size And Budget: N/A

C2 Experimental Setup And Hyperparameters: N/A

C3 Descriptive Statistics: N/A

C4 Parameters For Packages: N/A

D Human Subjects Including Annotators: No

D1 Instructions Given To Participants: N/A

D2 Recruitment And Payment: N/A

D3 Data Consent: N/A

D4 Ethics Review Board Approval: N/A

D5 Characteristics Of Annotators: N/A

E AI Assistants In Research Or Writing: Yes

E1 Information About Use Of AI Assistants: No

Author Submission Checklist: yes

Submission Number: 1359
