Noise, Novels, Numbers. A Framework for Detecting and Categorizing Noise in Danish and Norwegian Literature
Abstract: We present a framework for detecting and categorizing noise in literary texts, demonstrated through its application to Danish and Norwegian literature from the late 19th century. Noise, understood as "aberrant sonic behaviour," is not only an auditory phenomenon but also a cultural construct tied to the processes of civilization and urbanization. By leveraging topic modeling techniques and fine-tuned BERT-based language models trained on Danish and Norwegian texts, we analyze a corpus of over 800 novels to extract and examine noise-related topics. We identify and track the prevalence of noise in these texts, offering insights into the literary perceptions of noise during the Scandinavian "Modern Breakthrough" period (1870-1899). Our contributions include the development of a comprehensive dataset annotated for noise-related segments and their categorization into human-made, non-human-made, and musical noises. This study illustrates the framework's potential for enhancing the understanding of the relationship between noise and its literary representations, providing a deeper appreciation of the auditory elements that enrich literary works.
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: NLP datasets, evaluation, historical NLP
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: Danish, Norwegian
Submission Number: 1254