Iterative Introspection based Refinement: Boosting Multi-Document Scientific Summarization with Large Language Models

ACL ARR 2024 June Submission 1878 Authors

15 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: The current task setting and existing datasets for Multi-Document Scientific Summarization (MDSS) have created a significant gap between research and practical applications. However, the emergence of Large Language Models (LLMs) offers an opportunity to address MDSS from a more practical perspective. To this end, we redefine the MDSS task around the scenario of automatically generating an entire related work section, and construct a corresponding new dataset, ComRW. We first conduct a comprehensive evaluation of LLM performance on the newly defined task and identify three major deficiencies in their ability to address it: low coverage of reference papers, disorganized structure, and high redundancy. To alleviate these deficiencies, we propose an Iterative Introspection based Refinement (IIR) method that leverages LLMs to generate higher-quality summaries. IIR uses prompts equipped with Chain-of-Thought reasoning and fine-grained operators, treating the LLM both as an evaluator that diagnoses the three deficiencies and as a generator that refines the summary accordingly. Thorough automatic and human evaluations validate the effectiveness of our method: the results demonstrate that IIR effectively mitigates the three deficiencies and improves the quality of LLM-generated summaries. Moreover, our exploration provides insights for better addressing the MDSS task with LLMs.
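To make the evaluate-then-refine loop described in the abstract concrete, below is a minimal Python sketch of an iterative introspection cycle. Everything here is an illustrative assumption rather than the authors' actual implementation: the `call_llm` callable, the prompt templates, the `DEFICIENCIES` list, and the `max_rounds` stopping rule are all hypothetical stand-ins for the paper's Chain-of-Thought prompts and fine-grained operators.

```python
# Hypothetical sketch of an iterative introspection-based refinement loop:
# an LLM first acts as an evaluator for each deficiency, then as a generator
# that revises the draft. Prompt wording and names are assumptions.

DEFICIENCIES = [
    "coverage of reference papers",
    "organizational structure",
    "redundancy",
]

EVAL_PROMPT = (
    "You are an evaluator. Think step by step and assess the draft related "
    "work section for {deficiency}. Reply 'OK' if acceptable; otherwise list "
    "the concrete problems.\n\nReference papers:\n{papers}\n\nDraft:\n{draft}"
)

REFINE_PROMPT = (
    "You are a generator. Revise the draft related work section to fix the "
    "problems below using fine-grained edits (add, reorder, delete).\n\n"
    "Problems:\n{feedback}\n\nReference papers:\n{papers}\n\nDraft:\n{draft}"
)


def iterative_refine(call_llm, papers: str, draft: str, max_rounds: int = 3) -> str:
    """Iteratively introspect on a draft summary and refine it.

    call_llm: any str -> str function wrapping a chat LLM API.
    """
    for _ in range(max_rounds):
        revised = draft
        for deficiency in DEFICIENCIES:
            # Evaluator pass: diagnose one deficiency at a time.
            feedback = call_llm(EVAL_PROMPT.format(
                deficiency=deficiency, papers=papers, draft=revised))
            if feedback.strip().upper().startswith("OK"):
                continue  # this deficiency is judged acceptable; no refinement
            # Generator pass: revise the draft against the diagnosed problems.
            revised = call_llm(REFINE_PROMPT.format(
                feedback=feedback, papers=papers, draft=revised))
        if revised == draft:
            break  # no deficiency triggered a revision; treat as converged
        draft = revised
    return draft
```

Evaluating each deficiency separately, as sketched here, lets every generator call target one diagnosed problem at a time, which matches the abstract's description of fine-grained operators aimed at coverage, structure, and redundancy; the actual prompts and stopping criteria in the paper may differ.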
Paper Type: Long
Research Area: Summarization
Research Area Keywords: long-form summarization, multi-document summarization
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Data resources
Languages Studied: English
Submission Number: 1878