Keywords: constrained decoding; context-free parsing; GPU acceleration
Abstract: Constrained decoding systems are often built on context-free parsers intended for programming languages.
These parsers either degrade to $O(n^3)$ time complexity or fail entirely unless the grammar is carefully engineered to preserve properties such as determinism and unambiguity.
There is thus a need to design parsers that efficiently handle non-determinism and ambiguity, while simultaneously being incremental so that they can be coupled with the token-based predictions of large language models.
Inspired by prior work, we derive an incremental Valiant prefix recognizer, which still has $O(n^3)$ complexity but can be accelerated using only a fraction of a GPU's resources (streaming multiprocessors).
Our parser remains efficient on complex context-free grammars where other parsers crash or degrade.
At the same time, we remain empirically competitive in restrictive grammar classes such as LALR.
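To make the incremental, cubic-time setting concrete, the following is a minimal sketch of a token-by-token CYK recognizer. It is not the Valiant prefix recognizer described in the abstract: it assumes a grammar already in Chomsky normal form, runs on the CPU, and decides only whether the tokens seen so far form a complete sentence, not whether they can still be extended into one. The class name `IncrementalCYK`, the rule tables `UNARY` and `BINARY`, and the balanced-parentheses grammar are all hypothetical, chosen for illustration.

```python
from collections import defaultdict

# Hypothetical CNF grammar for non-empty balanced parentheses:
#   S -> S S | L R | L X,   X -> S R,   L -> "(",   R -> ")"
UNARY = {"(": {"L"}, ")": {"R"}}            # terminal -> nonterminals
BINARY = {                                  # (B, C) -> nonterminals A with A -> B C
    ("S", "S"): {"S"},
    ("L", "R"): {"S"},
    ("L", "X"): {"S"},
    ("S", "R"): {"X"},
}


class IncrementalCYK:
    """Token-by-token CYK recognizer for a grammar in Chomsky normal form."""

    def __init__(self, unary, binary, start):
        self.unary, self.binary, self.start = unary, binary, start
        self.n = 0                          # number of tokens consumed so far
        self.chart = defaultdict(set)       # (i, j) -> nonterminals deriving tokens[i:j]

    def push(self, token):
        """Append one token and fill only the new chart column (spans ending at j)."""
        j = self.n + 1
        self.chart[(j - 1, j)] = set(self.unary.get(token, ()))
        # Iterate start positions right to left so chart[(k, j)] is ready for k > i.
        for i in range(j - 2, -1, -1):
            cell = set()
            for k in range(i + 1, j):       # split point between the two children
                for b in self.chart[(i, k)]:
                    for c in self.chart[(k, j)]:
                        cell |= self.binary.get((b, c), set())
            self.chart[(i, j)] = cell
        self.n = j

    def accepts(self):
        """True if the tokens consumed so far form a complete sentence."""
        return self.start in self.chart[(0, self.n)]


recognizer = IncrementalCYK(UNARY, BINARY, "S")
for tok in "(())":
    recognizer.push(tok)
    print(tok, recognizer.accepts())
```

Each call to `push` visits every (start, split) pair for the new column, so the work per token grows quadratically with the prefix length and the total over $n$ tokens is $O(n^3)$ for a fixed grammar. Valiant's formulation recasts the split-point loop as Boolean matrix multiplication, which is what makes this kind of recognizer a candidate for GPU acceleration.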
Paper Type: Long
Research Area: Hierarchical Structure Prediction, Syntax, and Parsing
Research Area Keywords: parsing algorithms (symbolic, theoretical results); hierarchical structure prediction
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency, Publicly available software and/or pre-trained models, Theory
Languages Studied: English
Submission Number: 6934