A Corpus of Czech Essays from the Turn of the 1900s

Petr Pořízka

Published: 30 Dec 2021, Last Modified: 24 Mar 2024; Jazykovedný časopis, 72/2, 2021, 618–630; Everyone; CC BY-ND 4.0

Abstract: The article presents a new corpus of Czech literary essays covering approximately fifty years, from 1890 to 1940. Along with a characterisation of the corpus and its annotation, the paper focuses on the TXM corpus tool. In the second part of the study, we use selected texts to analyse seven different authors by means of multidimensional cluster analysis, factorial correspondence analysis and a specificity score. The main parameter of the analyses was the use of parts of speech by the individual authors, supplemented by a partial analysis of grammatical number. At present, the texts cover various topics (music, visual arts, theatre, literature, etc.).