Abstract: The increasing volume and complexity of financial documents pose significant challenges for automated summarisation systems. Large language models (LLMs), while capable of handling long inputs, often struggle to maintain accuracy and coherence when summarising such lengthy and specialised documents. To address these limitations, we introduce PRAGSum, a cost-efficient, language-agnostic retrieval-augmented generation (RAG) system that leverages prototype-as-query retrieval to generate concise and coherent summaries of extended financial reports. In experiments on the Financial Narrative Summarisation (FNS) 2023 dataset, PRAGSum achieves a state-of-the-art ROUGE-2 F-score of $0.28$. Additionally, we present SummQQ, a novel LLM-based evaluation framework that assesses summaries across five linguistic dimensions without the need for reference summaries. On the DUC 2007 dataset, SummQQ correlates considerably better with human judgements than existing readability and fluency metrics, attaining an average Spearman's $\rho$ of $0.543$.
Paper Type: Long
Research Area: Summarization
Research Area Keywords: financial/business NLP; extractive summarisation; abstractive summarisation; multilingual summarisation; long-form summarisation; few-shot summarisation; evaluation
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English, Spanish, Greek
Submission Number: 4762