The Dark Side of the Language: Syntax-Based Neural Networks Rivaling Transformers in Definitely Unseen Sentences

Published: 2023, Last Modified: 18 Jun 2024 · WI/IAT 2023 · CC BY-SA 4.0
Abstract: Syntax-based methods have long been used as key components of Natural Language Processing systems for solving a variety of tasks. Yet, pre-trained Transformers are challenging all these pre-existing methods, and even humans, in nearly all tasks. The massive datasets used for pre-training seem to be the key to their success on existing tasks. In this paper, we show that syntax-based neural networks rival Transformer models on tasks over definitely unseen sentences. Experiments on classification tasks over a DarkNet corpus, which provides such definitely unseen sentences, show that syntactic and lexical neural networks perform on par with pre-trained Transformers even after fine-tuning and domain adaptation. Only after what we call extreme domain adaptation, that is, allowing BERT to retrain on the test set with the masked language model task, do pre-trained Transformers reach their standard high results. Hence, in normal conditions where sentences are truly unseen, syntax-based models are a viable alternative that is more transparent and has fewer parameters than Transformer-based approaches.
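The "extreme domain adaptation" described above corresponds to continuing a masked-language-model training phase on the unlabeled evaluation sentences before the downstream classification. The sketch below illustrates one common way such a step is done with Hugging Face Transformers; the model name, hyperparameters, and the `test_sentences` placeholder are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch of continued MLM pre-training ("extreme domain adaptation")
# on unlabeled target-domain sentences. Assumptions: bert-base-uncased as the
# backbone and `test_sentences` standing in for the DarkNet test-set text.
import datasets
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

test_sentences = ["..."]  # unlabeled sentences from the target (test) domain

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Tokenize the raw text into model inputs.
ds = datasets.Dataset.from_dict({"text": test_sentences})
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
            batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens, the standard BERT masking rate.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm-domain-adapt",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=ds,
    data_collator=collator,
)
trainer.train()  # adapt the masked LM to the previously unseen domain
```

After this step, the adapted encoder would be fine-tuned on the labeled classification task as usual; without it, the paper reports that pre-trained Transformers do not outperform syntax-based networks on these definitely unseen sentences.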