Abstract: Official court press releases from Germany’s highest courts are vital for bridging complex judicial rulings and the public. Prior NLP work on German legal text summarization emphasizes technical headnotes, often ignoring the need for citizen-oriented communication. We introduce CourtPressGER, a dataset of 6.4k triples of rulings, their human-drafted press releases, and synthetic contextual prompts that allow LLMs to generate comparable press releases. The resulting benchmark is intended to train and evaluate LLMs in generating accurate, more readable summaries from long judicial texts. We benchmark a set of small and large LLMs on the task and evaluate model outputs via reference-based metrics, factual-consistency checks, and an LLM-as-judge approach that approximates expert review. We further conduct a qualitative expert analysis and ranking. Results show that large LLMs produce near-human-quality drafts and lose only marginal performance when applied hierarchically. Smaller models require a hierarchical setup to summarize long judgments at all and achieve mixed scores. All models struggle with factual consistency, and the human-drafted press release is consistently ranked highest.
Paper Type: Short
Research Area: Summarization
Research Area Keywords: extractive summarization, abstractive summarization, multimodal summarization, long-form summarization, evaluation, factuality
Contribution Types: Data resources, Data analysis
Languages Studied: German
Submission Number: 6652