GeNeRTe: Generating Neural Representations from Text for Classification.

ACL ARR 2024 April Submission 710 Authors

16 Apr 2024 (modified: 20 May 2024) · ACL ARR 2024 April Submission · CC BY 4.0
Abstract: Advancements in language modelling over the last decade have significantly improved downstream tasks such as automated text classification. However, deploying such systems requires high computational resources and extensive training data. Human adults can effortlessly perform such tasks with minimal computational overhead and training data, which prompts research into leveraging neurocognitive signals such as Electroencephalography (EEG). We compare features from Large Language Models (LLMs) with EEG features captured during natural reading for text classification. Additionally, we introduce GeNeRTe, a novel state-of-the-art synthetic EEG generative model. Using only a limited amount of data, GeNeRTe learns to produce synthetic EEG features for a sentence through a neural regressor that learns the relationship between a sentence's embeddings and its natural EEG. Our experiments show that GeNeRTe can effectively synthesize EEG features for unseen test sentences with just 236 sentence-EEG training pairs. Furthermore, using synthetic EEG features significantly improves text classification performance and reduces computation time. Our results emphasize the potential of synthetic EEG features, providing a viable path to create a new type of physiological embedding with lower computing requirements and improved model performance in practical applications.
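As a rough illustration of the pipeline the abstract describes, the sketch below regresses EEG feature vectors from LLM sentence embeddings with a small MLP trained by mean-squared error. The architecture, the dimensions (EMB_DIM, EEG_DIM), and the training details are illustrative assumptions, not the authors' actual GeNeRTe design, which the abstract does not specify.

```python
# Minimal sketch (assumptions, not the paper's implementation): a neural
# regressor mapping sentence embeddings to synthetic EEG feature vectors.
import torch
import torch.nn as nn

EMB_DIM = 768   # assumed LLM sentence-embedding size (hypothetical)
EEG_DIM = 105   # assumed EEG feature-vector size (hypothetical)

class EEGRegressor(nn.Module):
    def __init__(self, emb_dim: int = EMB_DIM, eeg_dim: int = EEG_DIM):
        super().__init__()
        # A small MLP keeps the parameter count low, which suits the
        # low-data regime the abstract reports (~236 training pairs).
        self.net = nn.Sequential(
            nn.Linear(emb_dim, 256),
            nn.ReLU(),
            nn.Linear(256, eeg_dim),
        )

    def forward(self, sentence_emb: torch.Tensor) -> torch.Tensor:
        # Predict an EEG feature vector from one sentence embedding.
        return self.net(sentence_emb)

def train(model, embeddings, eeg_targets, epochs: int = 100, lr: float = 1e-3):
    """Fit the regressor on paired (sentence embedding, natural EEG) data."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(embeddings), eeg_targets)
        loss.backward()
        opt.step()
    return model
```

Once trained, calling the model on embeddings of unseen sentences yields synthetic EEG features that can be fed to a downstream text classifier in place of, or alongside, natural EEG.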
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: generative models; efficient models; analysis; model architectures; cognitive modeling
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources, Data analysis
Languages Studied: English
Submission Number: 710