Abstract: To enhance human-computer interaction in naturalistic environments, a computing system could benefit
from predicting where a user will direct their visual attention, which would allow it to adapt its behaviour
accordingly. We investigated whether future visual attention could be predicted from past eye-gaze dynamics
in a simulated meeting in virtual reality. From recorded eye movements, we extracted gaze samples across
objects and people, which significantly reduced the dimensionality of the input and output space of the model
compared to a coordinate-based approach and enabled us to train predictive time-series models on long (16 min)
videos at low computational cost. Compared to baseline and classical autoregressive models, a recurrent
neural network model improved performance in future gaze prediction by 64%. Obtained with a self-supervised
approach, these initial results suggest that there is structure in users’ gaze dynamics and that predictive models
could be used to enable human-centric adaptive interfaces.
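
To illustrate the object-based representation described above, the following is a minimal sketch, not the authors' implementation. It assumes each gaze sample has already been assigned the label of the object or person being looked at. The label names, the function names, and the choice of a persistence rule as a baseline are all hypothetical. The sketch shows how a gaze recording becomes a short sequence of integer ids rather than a stream of screen coordinates, and how a simple baseline can be scored on that sequence.

```python
def encode_targets(labels):
    """Map each gaze-target label to an integer id, in order of first appearance."""
    vocab = {}
    ids = []
    for label in labels:
        if label not in vocab:
            vocab[label] = len(vocab)
        ids.append(vocab[label])
    return ids, vocab


def persistence_predict(ids, horizon=1):
    """Hypothetical baseline: the target at t + horizon is assumed equal to the target at t."""
    return ids[:-horizon]


def accuracy(predicted, actual):
    """Fraction of time steps where the predicted target id matches the actual one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical per-sample labels for a short stretch of a meeting.
labels = ["speaker_a", "speaker_a", "slides", "speaker_b", "speaker_b", "speaker_a"]

ids, vocab = encode_targets(labels)        # ids == [0, 0, 1, 2, 2, 0]
horizon = 1
predicted = persistence_predict(ids, horizon)
actual = ids[horizon:]
baseline_accuracy = accuracy(predicted, actual)
```

In this representation the model's input and output space has one dimension per tracked object or person (three here), which is what keeps training on long recordings cheap. A learned sequence model would replace `persistence_predict` and be scored with the same `accuracy` function.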