Keywords: VLMs; artwork; cultural heritage; emotions; symbols; art history
TL;DR: We investigate how well current VLMs can interpret (different aspects of) emotions in artworks
Abstract: Emotions are a fundamental aspect of artistic expression. Due to
their abstract nature, emotions are realized in artworks in a broad
spectrum of ways. These realizations are subject to historical change,
and their analysis requires expertise in art history.
In this article, we investigate which aspects of emotional
expression can be detected by current (2025) vision-language models
(VLMs). We present a case study of three VLMs (Llava-Llama and two
Qwen models) in which we ask these models four sets of questions of
increasing complexity about artworks (general content, emotional
content, expression of emotions, and emotion symbols) and carry out
a qualitative expert evaluation. We find that
the VLMs recognize the content of the images surprisingly well and
often also identify which emotions they depict and how these emotions
are expressed.
The models perform best on concrete images but fail on highly
abstract or highly symbolic images. Reliable recognition of symbols
remains fundamentally difficult. Furthermore, the models continue to
exhibit the well-known LLM weakness of providing inconsistent
answers to related questions.
Submission Number: 1