Abstract: Large Language Models (LLMs) have achieved remarkable success across a wide range of natural language processing tasks, yet their ability to generate long-form content remains poorly understood and under-evaluated.
Our analysis reveals that current LLMs struggle to satisfy length requirements and maintain information density in long-text generation, with performance deteriorating as the target length increases.
To quantitatively locate this performance degradation and provide further insights for model development, we present \textbf{LongEval}, a benchmark that evaluates long-text generation through both \textit{direct} and \textit{plan-based} generation paradigms, inspired by cognitive and linguistic models of writing.
The comprehensive experiments in this work reveal interesting findings, such as that while model size generally correlates with generation ability, small-scale models trained extensively on long texts (e.g., LongWriter) can achieve comparable performance.
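To make the two paradigms concrete, here is a minimal Python sketch of how they might be instantiated; the names `llm`, `direct_generation`, and `plan_based_generation` are hypothetical illustrations, not the LongEval implementation, and the actual prompts and interfaces in the benchmark may differ.

```python
def llm(prompt: str) -> str:
    """Placeholder for a call to any text-generation model or API."""
    raise NotImplementedError("plug in your model call here")


def direct_generation(topic: str, target_words: int) -> str:
    # Direct paradigm: request the full long-form text in a single pass.
    return llm(f"Write about {topic} in roughly {target_words} words.")


def plan_based_generation(topic: str, target_words: int) -> str:
    # Plan-based paradigm: first elicit an outline, then expand each
    # outline item, mirroring the plan-then-draft structure of
    # cognitive models of writing.
    outline = llm(
        f"Draft a section-by-section outline for a "
        f"{target_words}-word piece on {topic}."
    )
    sections = [line for line in outline.splitlines() if line.strip()]
    drafts = [llm(f"Expand this outline item into prose: {s}") for s in sections]
    return "\n\n".join(drafts)
```

Under this framing, the direct paradigm stresses a model's ability to sustain length and density in one generation, while the plan-based paradigm isolates whether an explicit intermediate plan mitigates the degradation observed at longer lengths.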
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Long text generation
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: English
Submission Number: 2587