Reading Models’ Self-Defense: Narratology as Legibility Instrument for Cultural AI Evaluation

Published: 01 Jun 2026, Last Modified: 01 Jun 2026, Venue: Culture x AI 2026 Oral, License: CC BY 4.0
Keywords: Narratology, Cultural AI, Legibility, Measurability, LLM Evaluation, Self-Defense Patterns
TL;DR: This paper argues that the interpretive commitments of frontier LLMs should be made legible rather than reduced to scores, demonstrated through a narratology-based apparatus that surfaces tension-defense.
Abstract: This paper makes a case for approaching the interpretive commitments of AI systems as something to be made legible rather than reduced to measurable scores or left to be inherited unexamined. We asked six large language models (LLMs) to select 20 constraints from a narratology-based library of 200, then to assess the compatibility of their own selection, in order to see how each model handles narrative design evaluation. We found that models commonly respond to the task of identifying interference by denying it preemptively or immediately redeeming it as productive friction. This pattern of tension-defense remains consistent within models and recurs across generations (established across ~2970 audit responses), with some rhetorical particularities. We propose that humanistic disciplines can provide the analytic apparatus through which the development of cultural AI extends beyond measurability.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 70