Abstract: Despite the impressive capabilities of pre-trained language models (PLMs) and large language models (LLMs), both still struggle with the long-text summarization (LTS) task. On the one hand, constrained by a maximum input length, PLMs rely on simple truncation or salience analysis to extract a subset of sentences from the original input, which can lead to information loss. On the other hand, although LLMs place no such limit on input length, the summaries they generate often introduce irrelevant information, leading to a large gap between prediction and reference. To this end, this paper proposes a hybrid long-text summarization model, PLLM-TS (Leveraging PLMs and LLMs for Long-Text Summarization), which combines the advantages of both PLMs and LLMs. PLLM-TS is a pipeline composed of two modules. The first module is an extractive summarization model (e.g., BERTSum) enhanced with a combined attention mechanism. The second module is a switching generative summarization model whose behavior depends on the number of sentences extracted by the first module. Specifically, when many sentences are extracted, we apply a PLM-based summarization model (e.g., Pegasus) to generate a summary from them; when few sentences are extracted, we instead use an LLM-based summarization model (e.g., GPT-2) to rewrite them into a shorter, more concise summary. Experiments on two public academic-paper datasets, arXiv and PubMed, demonstrate the effectiveness and superiority of our method.
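The following is a minimal sketch of the two-module switching pipeline described above. The extractor, the switching threshold, and the model stand-ins are illustrative assumptions only, not the paper's actual configuration or code.

```python
# Minimal sketch of a PLLM-TS-style switching pipeline (illustrative only).
# The extractor, the threshold value, and the summarizer stand-ins are
# hypothetical placeholders, not the paper's actual implementation.
from typing import List


def extract_salient_sentences(document: str) -> List[str]:
    """Placeholder for the first module (e.g., a BERTSum-style extractor)."""
    # A real extractor would score and select salient sentences; here we
    # simply split on periods so the sketch stays runnable.
    return [s.strip() for s in document.split(".") if s.strip()]


def summarize_with_plm(sentences: List[str]) -> str:
    """Placeholder for the PLM-based abstractive model (e.g., Pegasus)."""
    return " ".join(sentences[:2])  # stand-in for a generated summary


def rewrite_with_llm(sentences: List[str]) -> str:
    """Placeholder for the LLM-based rewriter (e.g., GPT-2)."""
    return " ".join(sentences)  # stand-in for a rewritten, concise summary


def pllm_ts_summarize(document: str, threshold: int = 10) -> str:
    """Route the extracted sentences to the PLM or LLM module by their count."""
    sentences = extract_salient_sentences(document)
    if len(sentences) > threshold:
        # Many extracted sentences: condense them with the PLM-based model.
        return summarize_with_plm(sentences)
    # Few extracted sentences: rewrite them concisely with the LLM-based model.
    return rewrite_with_llm(sentences)


if __name__ == "__main__":
    print(pllm_ts_summarize("Sentence one. Sentence two. Sentence three."))
```

In a real system, the placeholder functions would wrap the extractive model and the two generative models; the sketch only illustrates the routing decision based on the number of extracted sentences.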