Abstract: We present a grammar inference system that leverages linguistic knowledge recorded in the form of annotations in
interlinear glossed text (IGT) and in a meta-grammar engineering system (the LinGO Grammar Matrix customization system) to
automatically produce machine-readable HPSG grammars. Building on prior work to handle the inference of lexical classes, stems,
affixes and position classes, and preliminary work on inferring case systems and word order, we introduce an integrated grammar
inference system called basil that covers a wide range of fundamental linguistic phenomena. System development was guided by
27 genealogically and geographically diverse languages, and we test the system’s cross-linguistic generalizability on an additional
5 held-out languages, using datasets provided by field linguists. Our system outperforms three baseline systems, increasing
coverage while limiting ambiguity and producing richer semantic representations; its representations are also richer than those of
previous work in grammar inference.