Comprehensive Sentiment Analysis of Polish Book Reviews Using Large and Small Language Models

Agnieszka Karlinska, Piotr Milkowski, Paulina Czwordon-Lis, Bartlomiej Koptyra, Jan Kocon

Published: 2024, Last Modified: 20 May 2025, ICDM (Workshops) 2024, CC BY-SA 4.0

Abstract: This paper presents a comprehensive study of sentiment analysis for Polish book reviews through the creation of a novel, manually annotated dataset and the evaluation of various language models. We introduce a detailed sentiment annotation scheme, addressing challenges encountered during the annotation process, and evaluate model performance on sentiment classification at both the sentence and document levels, as well as on text type identification. The study compares specialized Polish transformer models, newly developed Polish-specific large language models (LLMs), and leading commercial LLMs, testing both fine-tuning and zero-shot approaches. Results show that fine-tuned, Polish-adapted LLMs significantly outperform both small language models (SLMs) and commercial zero-shot LLMs, underscoring the importance of domain-specific fine-tuning and language adaptation for sentiment analysis in specialized contexts such as literary criticism.