SURVEYFORGE: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing
Abstract: Survey papers play a crucial role in scientific research, especially given the rapid growth of research publications. Recently, researchers have begun using LLMs to automate survey generation for better efficiency. However, the quality gap between LLM-generated surveys and those written by humans remains significant, particularly in outline quality and citation accuracy. To close these gaps, we introduce SURVEYFORGE, which first generates the outline by analyzing the logical structure of human-written outlines and referring to retrieved domain-related articles. Subsequently, leveraging high-quality papers retrieved from memory by our scholar navigation agent, SURVEYFORGE can automatically generate and refine the content of the survey. Moreover, to achieve a comprehensive evaluation, we construct SurveyBench, which includes 100 human-written survey papers for win-rate comparison and assesses AI-generated surveys across three dimensions: reference, outline, and content quality. Experiments demonstrate that SURVEYFORGE outperforms previous works such as AutoSurvey.
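The abstract describes a two-stage pipeline: outline generation guided by the structure of human-written outlines plus retrieved articles, followed by memory-driven content generation and refinement. The following is a minimal Python sketch of that flow; all function names, data shapes, and placeholder logic are illustrative assumptions, not SURVEYFORGE's actual implementation.

```python
# Hypothetical sketch of the two-stage pipeline from the abstract:
# (1) outline generation informed by human outline structure and retrieved
# articles, (2) memory-driven content generation per section.
# Every name below is illustrative, not taken from SURVEYFORGE's code.

def generate_outline(topic, human_outline_patterns, retrieved_articles):
    """Draft an outline guided by structural patterns observed in
    human-written surveys and by domain-related articles (placeholder)."""
    sections = [f"{i + 1}. {p}" for i, p in enumerate(human_outline_patterns)]
    return {"topic": topic, "sections": sections, "refs": retrieved_articles}

def write_survey(outline, memory):
    """For each section, pull high-quality papers from a memory store
    (standing in for the scholar navigation agent) and draft content."""
    draft = []
    for section in outline["sections"]:
        papers = memory.get(section, [])
        draft.append((section, f"Discusses {len(papers)} retrieved papers."))
    return draft

outline = generate_outline(
    "Automated Survey Writing",
    ["Introduction", "Outline Heuristics", "Memory-Driven Generation"],
    ["paperA", "paperB"],
)
memory = {
    "2. Outline Heuristics": ["paperA"],
    "3. Memory-Driven Generation": ["paperB"],
}
survey = write_survey(outline, memory)
```

In this sketch the `memory` dictionary stands in for the scholar navigation agent's retrieval store; the real system would retrieve and rank papers per section rather than use a static mapping.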