Abstract: Syntactic language models (SLMs) enhance Transformers by incorporating syntactic biases through the joint modeling of linearized syntactic parse trees and surface sentences. This paper focuses on compositional SLMs, which are based on constituency parse trees and perform explicit bottom-up composition of constituent representations. We identify the key design choices in existing compositional SLMs and propose a unified framework that encompasses both existing models and novel variants. We conduct a comprehensive empirical evaluation of all variants in our framework across language modeling, syntactic generalization, summarization, and inference efficiency. Based on the experimental results, we make several recommendations on the design of compositional SLMs. Our code will be made publicly available upon acceptance of the paper.
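To make the setup concrete, below is a minimal sketch of how a compositional SLM might interleave words with tree-building actions and apply explicit bottom-up composition. This is an illustration, not the paper's implementation: the open-nonterminal/REDUCE linearization (in the style of RNNG-like models) and the mean-pooling composition function are assumptions chosen for simplicity, as are all names in the code.

```python
# A minimal, illustrative sketch of a compositional syntactic LM's core loop.
# Assumptions (not the paper's design): RNNG-style open-NT/REDUCE
# linearization, random stand-in word embeddings, mean-pooling composition.
from dataclasses import dataclass

import numpy as np


@dataclass
class Tree:
    label: str      # nonterminal label, e.g. "NP"
    children: list  # list of Tree or str (terminal words)


def linearize(tree) -> list[str]:
    """Flatten a constituency tree into a top-down action sequence."""
    if isinstance(tree, str):        # terminal: emit the word itself
        return [tree]
    actions = [f"({tree.label}"]     # open the constituent
    for child in tree.children:
        actions.extend(linearize(child))
    actions.append("REDUCE")         # close it, triggering composition
    return actions


def compose(child_vecs: list[np.ndarray]) -> np.ndarray:
    """Bottom-up composition: collapse child vectors into one constituent
    vector. Mean pooling stands in for a learned composition function."""
    return np.mean(child_vecs, axis=0)


def run_stack(actions: list[str], dim: int = 4) -> np.ndarray:
    """Replay the actions with an explicit stack: on REDUCE, pop the
    finished constituent's children and push one composed vector."""
    rng = np.random.default_rng(0)
    stack: list = []
    for a in actions:
        if a == "REDUCE":
            children = []
            while not isinstance(stack[-1], str):  # pop child vectors
                children.append(stack.pop())
            stack.pop()                            # pop the open-NT marker
            stack.append(compose(children))        # push composed vector
        elif a.startswith("("):
            stack.append(a)                        # open-NT marker
        else:
            stack.append(rng.normal(size=dim))     # stand-in word embedding
    return stack[0]                                # vector for the root


# Example: the tree (S (NP the cat) (VP sat))
tree = Tree("S", [Tree("NP", ["the", "cat"]), Tree("VP", ["sat"])])
actions = linearize(tree)
print(actions)
# ['(S', '(NP', 'the', 'cat', 'REDUCE', '(VP', 'sat', 'REDUCE', 'REDUCE']
print(run_stack(actions))  # composed representation of the root constituent
```

In an actual SLM, the action sequence is what the Transformer models autoregressively, and the composed constituent vectors re-enter the context in place of their children; the sketch only makes that bottom-up composition step explicit.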
Paper Type: Long
Research Area: Syntax: Tagging, Chunking and Parsing
Research Area Keywords: constituency parsing, grammar and knowledge-based approaches
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 3849