GiLT: Augmenting Transformer Language Models with Dependency Graphs

ACL ARR 2025 May Submission 7847 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Augmenting Transformers with linguistic structures effectively enhances the syntactic generalization performance of language models. Previous work in this direction focuses on syntactic tree structures, in particular constituency trees. We propose the Graph-Infused Layers Transformer language model (GiLT), which leverages dependency graphs to augment Transformer language models. Unlike most previous work, GiLT does not insert extra structural tokens during language modeling; instead, it injects structural information by modulating the Transformer's attention weights with features extracted from a dependency graph that is incrementally constructed alongside token prediction. In our experiments, GiLT with semantic dependency graphs achieves better syntactic generalization while maintaining competitive perplexity compared with Transformer language model baselines. In addition, GiLT can be fine-tuned from a pretrained language model to improve downstream task performance. Our code is available for release.
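As a rough illustration of the attention-modulation idea described in the abstract, the sketch below shifts attention logits by a learned bias computed from a dependency-graph adjacency matrix. All names, tensor shapes, and the additive-bias formulation are illustrative assumptions, not the authors' actual GiLT implementation.

```python
# Minimal sketch of graph-modulated attention (assumed formulation, not GiLT's code).
import torch
import torch.nn.functional as F


def graph_biased_attention(q, k, v, graph_adj, bias_proj):
    """Scaled dot-product attention whose logits are shifted by a learned
    bias derived from the (incrementally built) dependency-graph adjacency.

    q, k, v:    (batch, seq, dim) query/key/value tensors
    graph_adj:  (batch, seq, seq) 0/1 adjacency of the dependency graph so far
    bias_proj:  nn.Linear mapping a scalar edge indicator to a scalar bias
    """
    d = q.size(-1)
    logits = q @ k.transpose(-1, -2) / d ** 0.5             # (batch, seq, seq)
    # Turn each edge indicator into a learned additive bias on the logits.
    bias = bias_proj(graph_adj.unsqueeze(-1)).squeeze(-1)   # (batch, seq, seq)
    logits = logits + bias
    # Causal mask so position i attends only to positions <= i (language modeling).
    seq = q.size(1)
    causal = torch.tril(torch.ones(seq, seq, dtype=torch.bool, device=q.device))
    logits = logits.masked_fill(~causal, float("-inf"))
    weights = F.softmax(logits, dim=-1)
    return weights @ v


if __name__ == "__main__":
    torch.manual_seed(0)
    batch, seq, dim = 2, 5, 16
    q, k, v = (torch.randn(batch, seq, dim) for _ in range(3))
    # Hypothetical dependency edges predicted so far, as a 0/1 matrix.
    adj = torch.randint(0, 2, (batch, seq, seq)).float()
    proj = torch.nn.Linear(1, 1)
    print(graph_biased_attention(q, k, v, adj, proj).shape)  # torch.Size([2, 5, 16])
```

In this sketch the graph only biases the logits rather than adding structural tokens to the input sequence, mirroring the abstract's description at a high level; how GiLT actually extracts and injects graph features is specified in the paper itself.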
Paper Type: Long
Research Area: Syntax: Tagging, Chunking and Parsing
Research Area Keywords: dependency parsing, grammar and knowledge-based approaches, semantic parsing
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 7847