Keywords: Large Language Model, LLM Agent, Automated Scientific Discovery, AI Scientist
Abstract: Novel research ideas play a critical role in advancing scientific inquiry. Recent advances in Large Language Models (LLMs) have shown their potential to generate novel research ideas by leveraging large-scale scientific literature. However, previous work in research ideation has relied primarily on simplistic methods, such as keyword co-occurrence or semantic similarity. These approaches identify statistical associations in the literature but overlook the complex, contextual relationships between scientific concepts, which are essential for effectively leveraging the knowledge embedded in human literature. For instance, a paper that mentions both "keyword A" and "keyword B" often presents a research idea that integrates the two concepts, a relationship that a bare co-occurrence count does not capture. Additionally, some LLM-driven methods propose and iteratively enhance research ideas using the model's vast internal knowledge, but they fail to leverage the valuable scientific concept network, which limits how well these ideas are grounded in established research. To address these challenges, we propose the \textbf{Deep Ideation} framework, which integrates a scientific network that captures not only keyword co-occurrence but also the contextual relationships between keywords, providing a richer scientific foundation for LLM-driven ideation. Our framework introduces an explore-expand-evolve workflow that integrates several key components to iteratively refine research ideas. Throughout this workflow, we maintain an Idea Stack to track research progress across iterations. To guide this search and evolution process, we integrate a critic engine trained on real-world reviewer feedback, which provides continuous signals on the novelty and feasibility of generated ideas.
Experimental results across multiple AI domains show that our approach significantly improves the overall quality of generated ideas by \textbf{10.67\%} compared to other methods, with the generated ideas exceeding the acceptance level of top conferences. Human evaluation highlights the practical value of the generated ideas in supporting scientific research, while ablation studies further confirm the effectiveness of each component of the workflow.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 20167