Design and Implementation of German Legal Decision Corpora

Stefanie Urchs; Jelena Mitrovic; Michael Granitzer

Published: 01 Jan 2021, Last Modified: 30 Jul 2025, Venue: ICAART (2) 2021, License: CC BY-SA 4.0

Abstract: Law professionals are wordsmiths; their main tool is language. The field of law therefore produces a vast amount of written text. These texts have to be analysed, summarised, and used in the creation of new text, a task that reaches the limits of what is humanly possible. However, this analysis can be automated with Natural Language Processing techniques, which require (annotated) text corpora. Unfortunately, publicly available (annotated) legal text corpora are rare, and (annotated) German legal text corpora are scarcer still. To meet this need, this paper presents two publicly available German legal text corpora. The first corpus contains 32,748 decisions from 131 German courts, enriched with metadata. The second is a subset of the first and consists of 200 randomly chosen judgements. In these judgements a legal expert annotated the components conclusion, definition and subsumpt
