Do Language Models Understand Human Needs on Text Summarization?

ACL ARR 2024 June Submission1348 Authors

14 Jun 2024 (modified: 02 Jul 2024) · CC BY 4.0
Abstract: With the popularity of large language models and their high-quality text generation capabilities, researchers are using them as auxiliary tools for text summary writing. Although summaries generated by these large language models are fluent and capture key information sufficiently, the quality of their output depends on the prompt, and the generated text tends to be somewhat formulaic. To understand whether large language models truly understand human needs, we construct LecSumm: we recruit 200 college students to write summaries for lecture notes on ten different machine learning topics, and analyze real-world human summary needs along the dimensions of summary length, structure, modality, and content depth. We further evaluate fine-tuned and prompt-based language models on LecSumm and show that commercial GPT models perform better in summary coherence, fluency, and relevance, but still fall short in faithfulness and cannot fully capture human needs even with advanced prompt design, while fine-tuned models do not effectively learn human needs from the data. Our LecSumm dataset thus brings new challenges to both fine-tuned models and prompt-based large language models on the task of human-centered text summarization.
Paper Type: Long
Research Area: Human-Centered NLP
Research Area Keywords: Human-Centered NLP, Human factors in NLP
Contribution Types: Data resources, Data analysis
Languages Studied: English
Submission Number: 1348