IIR: Enhancing LLMs in Multi-Document Scientific Summarization via Iterative Introspection based Refinement

ACL ARR 2025 May Submission 2317 Authors

19 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Current Multi-Document Scientific Summarization (MDSS) research suffers from a significant gap between task formulations and practical applications. The emergence of Large Language Models (LLMs) provides an opportunity to bridge this gap from a more practical perspective. To this end, we redefine the MDSS task around the scenario of automatically generating an entire related work section, and construct ComRW, a new dataset aligned with this practical scenario. We first conduct a comprehensive evaluation of different LLMs on the newly defined task, and reveal three common deficiencies in their ability to address it: low coverage of reference papers, disorganized structure, and high redundancy. To alleviate these deficiencies, we propose an Iterative Introspection based Refinement (IIR) method that guides LLMs to generate higher-quality summaries. IIR uses prompts equipped with Chain-of-Thought reasoning and fine-grained operators, employing the LLM both as an evaluator that diagnoses the three deficiencies and as a generator that refines the summary accordingly. We conduct thorough automatic and human evaluations to validate the effectiveness of IIR. The results demonstrate that IIR can effectively mitigate the three deficiencies and improve the quality of summaries generated by different LLMs. Our work not only presents an effective MDSS solution, but also offers unique insights into addressing the inherent deficiencies of LLMs in real-world MDSS applications.
Paper Type: Long
Research Area: Summarization
Research Area Keywords: abstractive summarisation, multi-document summarization, few-shot summarisation
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English
Submission Number: 2317