Abstract: This research investigates the connection between self-attention mechanisms in large-scale pre-trained language models, such as BERT, and human gaze patterns, with the aim of harnessing gaze information to enhance the performance of natural language processing (NLP) models. We analyze the correlation between BERT attention and five distinct gaze signals using the Spearman correlation, finding that neither all attention layers nor all gaze signals accurately capture word importance. Building on this insight, we propose gaze-infused BERT, a novel model that integrates gaze signals into BERT to improve performance. Specifically, we first use a RoBERTa-based gaze prediction model to estimate five gaze signals; our lightweight model then applies the entropy weight method (EWM) to combine these five diverse signals into a comprehensive gaze representation. This representation is embedded into the transformer encoder during self-attention over the input sequence, enriching contextual information and boosting performance. Extensive evaluations on 15 datasets demonstrate that gaze-infused BERT consistently outperforms baseline models across various NLP tasks, highlighting the potential of integrating human gaze signals into pre-trained language models.
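The abstract mentions combining five gaze signals via the entropy weight method (EWM). As a rough illustration only (not the authors' released code), the sketch below shows how a standard EWM could fold token-level gaze signals into a single importance score per token; the signal names, array shapes, and min-max normalization step are assumptions for illustration.

```python
import numpy as np

def entropy_weight_combine(gaze_signals: np.ndarray) -> np.ndarray:
    """Combine gaze signals of shape (num_tokens, num_signals) into one
    importance score per token using the entropy weight method (EWM)."""
    eps = 1e-12
    # Min-max normalize each signal column to [0, 1] so signals are comparable.
    x = gaze_signals.astype(float)
    x = (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0) + eps)

    # Proportion of each token's value within its signal column.
    p = x / (x.sum(axis=0, keepdims=True) + eps)

    # Entropy of each signal; lower entropy means a more discriminative signal.
    n = x.shape[0]
    entropy = -(p * np.log(p + eps)).sum(axis=0) / np.log(n)

    # Entropy weights: signals with lower entropy receive higher weight.
    divergence = 1.0 - entropy
    weights = divergence / (divergence.sum() + eps)

    # Weighted sum over signals yields one gaze score per token.
    return x @ weights

if __name__ == "__main__":
    # Example: five hypothetical gaze signals (e.g. first fixation duration,
    # total reading time, ...) for an 8-token sentence.
    rng = np.random.default_rng(0)
    signals = rng.random((8, 5))
    print(entropy_weight_combine(signals))  # shape: (8,)
```

One plausible way such per-token scores could feed into the encoder, consistent with the abstract's description, is as an additive bias on the self-attention logits; the exact injection mechanism used by gaze-infused BERT is detailed in the paper itself.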