Abstract: Understanding how people comprehend visual narratives (including picture stories, comics, and
film) requires the combination of traditionally separate theories that span the initial sensory and
perceptual processing of complex visual scenes, the perception of events over time, and
comprehension of narratives. Existing piecemeal approaches fail to capture the interplay between these
levels of processing. Here, we propose the Scene Perception & Event Comprehension Theory
(SPECT), as applied to visual narratives, which distinguishes between front-end and back-end
cognitive processes. Front-end processes occur during single eye fixations and comprise
attentional selection and information extraction. Back-end processes occur across multiple fixations and
support the construction of event models, which reflect understanding of what is happening now
in a narrative (stored in working memory) and over the course of the entire narrative (stored in
long-term episodic memory). We describe relationships between front- and back-end processes,
and medium-specific differences that likely produce variation in front-end and back-end processes
across media (e.g., picture stories vs. film). We describe several novel research questions derived
from SPECT that we have explored. By addressing these questions, we provide greater insight into
how attention, information extraction, and event model processes are dynamically coordinated to
perceive and understand complex naturalistic visual events in narratives and the real world.