Letter_2_Santa.py – Tapping Big Data from the Arctic Circle

University of Eastern Finland DRDHum 2024 Conference Submission60 Authors

Published: 03 Jun 2024, Last Modified: 03 Jun 2024, Venue: DRDHum 2024, License: CC BY 4.0
Keywords: Finland, Linguistic Data Science, Cultural Studies, Art Education, Pragmatics
TL;DR: Submission for a POSTER
Abstract: Our poster presents the results of a pilot project that aims to build the Santa Claus Letter Corpus. These letters, sent to Finland from around the world, combine text and art, mostly handwriting enriched with drawings. The senders are primarily children. The physical collection at the National Archives comprises 25 shelf metres of letters. So far, they have been catalogued only in bundles, by country and year of origin. We began examining the collection in 2023: we digitised parts of it, enriched the cataloguing metadata, ran tests for quantitative analyses, and carried out initial qualitative analyses. Our original focus was on letters that we expected to be written in German, Finnish, Swedish, or Russian. However, we found immediately that the linguistic diversity is greater than the sender's postmark suggests, e.g. letters sent in Finnish from Sweden or in English from Germany. The main results of our pilot were: 1) documentation of workflows and data standards for digitisation; 2) preliminary (manual) indexing by language, artwork, and text type; 3) experiments with computational methods for indexing the letters (format, language, and text recognition); 4) a pragmatic analysis of a subset of German-language letters (name anonymised, in press). REFERENCES name anonymised (in press) „Briefe an den Weihnachtsmann in Finnland – eine unerforschte Textsorte. Kategorisierung und textpragmatische Auswertung“
Submission Number: 60
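
The manual indexing mentioned in the abstract (language, artwork, text type) and the observed gap between country of origin and letter language could be sketched as a small metadata record plus a mismatch filter. This is only an illustration: the field names, the `EXPECTED_LANGUAGE` mapping, the `language_mismatches` helper, and the three demo records are all hypothetical and do not reflect the project's actual data standard or findings.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical mapping from catalogued country to the language one might
# naively expect. Not taken from the project; for illustration only.
EXPECTED_LANGUAGE = {
    "Germany": "German",
    "Finland": "Finnish",
    "Sweden": "Swedish",
    "Russia": "Russian",
}


@dataclass(frozen=True)
class LetterRecord:
    """One indexed letter; all field names are assumptions."""
    letter_id: str
    country: str       # country of origin from the existing catalogue
    year: int          # year of origin from the existing catalogue
    language: str      # language identified during indexing
    has_artwork: bool  # whether the letter contains drawings
    text_type: str     # free-text label assigned during indexing


def language_mismatches(records):
    """Return records whose language differs from the one expected for
    their country. Countries without an expectation are skipped."""
    return [
        r for r in records
        if EXPECTED_LANGUAGE.get(r.country) not in (None, r.language)
    ]


# Invented demo records mirroring the kinds of cases named in the abstract.
records = [
    LetterRecord("demo-001", "Sweden", 2001, "Finnish", True, "wish list"),
    LetterRecord("demo-002", "Germany", 2001, "English", False, "letter"),
    LetterRecord("demo-003", "Germany", 2002, "German", True, "wish list"),
]

mismatched = language_mismatches(records)
# Count mismatches per (country, language) pair.
print(Counter((r.country, r.language) for r in mismatched))
```

Keeping the catalogued country and the identified language as separate fields is what makes such mismatches queryable at all; a single combined "origin" field would hide them.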