Abstract: Stereotypical character roles—also known as
archetypes or dramatis personae—play an im-
portant function in narratives: they facilitate
efficient communication with bundles of de-
fault characteristics and associations and ease
understanding of those characters’ roles in the
overall narrative. We present a fully unsuper-
vised k-means clustering approach for learn-
ing stereotypical roles given only structural
plot information. We demonstrate the tech-
nique on Vladimir Propp’s structural theory
of Russian folktales (captured in the extended
ProppLearner corpus, with 46 tales), showing
that our approach can induce six out of seven
of Propp’s dramatis personae with F1 mea-
sures of up to 0.70 (0.58 average), with an
additional category for minor characters. We
have explored various feature sets and varia-
tions of a cluster evaluation method. The best-
performing feature set comprises plot func-
tions, unigrams, tf-idf weights, and embed-
dings over coreference chain heads. Roles that
are mentioned more often (Hero, Villain) or
have clearly distinct plot patterns (Princess)
are more strongly differentiated than less fre-
quent or distinct roles (Dispatcher, Helper,
Donor). Detailed error analysis suggests that
the quality of the coreference chain and plot
function annotations is critical for this task.
We provide all our data and code for repro-
ducibility.
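
As a rough illustration of the pipeline summarized above (cluster characters by their features, then score each induced cluster against gold role labels with F1), the following is a minimal sketch and not the authors' implementation. The two-dimensional feature vectors, the role labels, the deterministic farthest-first initialization, and the majority-vote mapping from clusters to roles are all assumptions made for this example; the paper's actual features (plot functions, unigrams, tf-idf weights, embeddings) and its cluster evaluation method are not reproduced here.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first_init(points, k):
    """Deterministic seeding (an assumption for this sketch): start from the
    first point, then repeatedly add the point farthest from all chosen seeds."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(points, k, max_iters=100):
    """Plain Lloyd-style k-means; returns one cluster id per point."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def majority_mapping(cluster_ids, gold):
    """Name each cluster after the most frequent gold role inside it."""
    counts = {}
    for c, g in zip(cluster_ids, gold):
        counts.setdefault(c, Counter())[g] += 1
    return {c: cnt.most_common(1)[0][0] for c, cnt in counts.items()}

def f1_per_role(pred, gold):
    """Per-role F1 = 2TP / (2TP + FP + FN); 0.0 when a role has no true positives."""
    scores = {}
    for role in sorted(set(gold)):
        tp = sum(p == role and g == role for p, g in zip(pred, gold))
        fp = sum(p == role and g != role for p, g in zip(pred, gold))
        fn = sum(p != role and g == role for p, g in zip(pred, gold))
        scores[role] = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return scores

# Made-up toy data: one 2-D feature vector per character, with a gold role.
features = [
    (0.90, 0.10), (0.80, 0.20), (0.85, 0.15),   # Hero
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.85),   # Villain
    (0.10, 0.10), (0.20, 0.15), (0.15, 0.05),   # Princess
    (0.75, 0.20),                               # Helper, close to the Hero region
]
gold = ["Hero"] * 3 + ["Villain"] * 3 + ["Princess"] * 3 + ["Helper"]

clusters = kmeans(features, k=3)
mapping = majority_mapping(clusters, gold)
predicted = [mapping[c] for c in clusters]
scores = f1_per_role(predicted, gold)

for role, f1 in scores.items():
    print(f"{role}: F1 = {f1:.2f}")
```

In this toy setup the lone Helper falls into the Hero cluster, so the Helper role is not recovered and Hero precision drops, loosely mirroring the observation that less frequent or less distinct roles are harder to separate.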