Machine learning in computational literary studies

Published: 01 Jan 2023, Last Modified: 18 Jun 2024, it Inf. Technol. 2023, CC BY-SA 4.0
Abstract: In this article, we provide an overview of machine learning as it is applied in computational literary studies, the field concerned with the computational analysis of literary texts and literature-related phenomena. We survey a number of scientific publications with regard to the machine learning methodology their authors used, and we explain concepts of machine learning and natural language processing while discussing our findings. We establish that, besides transformer-based language models, researchers still make frequent use of more traditional, feature-based machine learning approaches; possible reasons for this are the difficulty of applying modern methods to the literary domain and the more transparent nature of traditional approaches. We shed light on how machine learning-based approaches are integrated into a research process that often proceeds primarily from the non-quantitative, interpretative approaches of non-digital literary studies. Finally, we conclude that large language models may simplify the application of machine learning methodology in computational literary studies going forward, provided that adequate approaches for the analysis of literary texts are found.