Abstract: Attention guides our gaze to fixate the proper location of the scene and holds it there for the amount of time warranted by current processing demands, before shifting to the next one. As such, gaze deployment is crucially a temporal process. Existing computational models have made significant strides in predicting the spatial aspects of observers' visual scanpaths (where to look), while often relegating to the background the temporal facet of attention dynamics (when). In this paper we
present TPP-Gaze, a novel and principled approach to modeling scanpath dynamics based on Neural Temporal Point Processes (TPPs), which jointly learns the temporal dynamics of fixation positions and durations by integrating deep learning methodologies with point process theory. We conduct extensive experiments across five publicly available datasets.
Our results show that the proposed model overall outperforms state-of-the-art approaches.
Source code and trained models are publicly available at:
https://github.com/phuselab/tppgaze.
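For background, the joint modeling of fixation positions and durations described above can be grounded in the standard log-likelihood of a marked temporal point process. The following is a generic sketch under assumed notation (fixation onset times $t_i$, locations $\mathbf{x}_i$, observation window $[0, T]$), not necessarily the exact parameterization used by TPP-Gaze:
\[
\log p\bigl(\{(t_i, \mathbf{x}_i)\}_{i=1}^{n}\bigr)
  = \sum_{i=1}^{n} \log \lambda^{*}(t_i)
  \;-\; \int_{0}^{T} \lambda^{*}(t)\,\mathrm{d}t
  \;+\; \sum_{i=1}^{n} \log f^{*}(\mathbf{x}_i \mid t_i),
\]
where $\lambda^{*}(t)$ is the history-conditioned intensity over fixation onsets (governing when the next fixation occurs, and hence durations) and $f^{*}(\mathbf{x} \mid t)$ is the conditional density over fixation locations (where to look); in a neural TPP, both terms are parameterized by a network that summarizes the event history.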