How Facility is the Small-Scale Abstractive Summarization Model: A Quantitative Study of Semantics and Syntax

ACL ARR 2024 June Submission 2696 Authors

15 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · License: CC BY 4.0
Abstract: Large-scale language models (LLMs) have demonstrated advances across numerous capabilities, including factual consistency in abstractive summarization. However, the benefits of small-scale language models (SLMs), such as straightforward deployment and low invocation latency, should not be disregarded. Current evaluation metrics provide only an abstract indication of differences in factuality scores, leaving it unclear in which specific areas SLMs underperform and whether the gap is tolerable in certain contexts. This study first illustrates the disparities between LLMs and SLMs in semantic knowledge and syntactic ability. We then propose a contrastive-learning-based SLM that incorporates tailored semantic and syntactic information and generates a parallel corpus of diverse summaries for the same document, each containing subtle semantic or syntactic flaws. By comprehensively integrating eight distinct factuality evaluation metrics, we further clarify what the gap in factuality scores represents and identify the primary factual challenges that current SLMs face in abstractive summarization.
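The abstract does not spell out the contrastive objective or how the parallel corpus of flawed summaries is used, so the sketch below is only an illustrative assumption of one common setup: a margin loss that pushes a faithful reference summary above negatives carrying subtle semantic or syntactic flaws. The model checkpoint (`facebook/bart-base`), the margin value, and the helper functions `sequence_log_prob` and `contrastive_loss` are all hypothetical choices, not the authors' method.

```python
# Illustrative sketch, not the paper's actual training recipe.
import torch
import torch.nn.functional as F
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")


def sequence_log_prob(document: str, summary: str) -> torch.Tensor:
    """Length-normalized log-likelihood of a summary given its source document."""
    src = tokenizer(document, return_tensors="pt", truncation=True)
    tgt = tokenizer(summary, return_tensors="pt", truncation=True)
    labels = tgt["input_ids"]
    out = model(input_ids=src["input_ids"],
                attention_mask=src["attention_mask"],
                labels=labels)
    log_probs = F.log_softmax(out.logits, dim=-1)
    # Log-probability assigned to each reference token, averaged over length.
    token_lp = log_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    return token_lp.mean()


def contrastive_loss(document: str, positive: str, negatives: list[str],
                     margin: float = 1.0) -> torch.Tensor:
    """Margin loss: the faithful summary should score higher than each flawed one."""
    pos = sequence_log_prob(document, positive)
    losses = [F.relu(margin - (pos - sequence_log_prob(document, neg)))
              for neg in negatives]
    return torch.stack(losses).mean()


# Toy usage: one document, its faithful summary, and two subtly corrupted negatives.
doc = "The company reported a 12% rise in quarterly revenue on Tuesday."
positive = "Quarterly revenue rose 12%."
negatives = ["Quarterly revenue fell 12%.",       # semantic flaw (negated fact)
             "Quarterly 12% revenue the rose."]   # syntactic flaw (scrambled order)
loss = contrastive_loss(doc, positive, negatives)
loss.backward()  # gradients would feed a standard optimizer step
```

In this kind of setup, the negatives play the role the abstract assigns to the parallel corpus: summaries of the same document that differ only by a small semantic or syntactic corruption, so the contrastive signal targets exactly those flaws.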
Paper Type: Long
Research Area: Summarization
Research Area Keywords: factuality
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 2696