- Keywords: vae, ner, tagging, crf, nlp, semi-supervised learning
- TL;DR: We embed a CRF in a VAE of tokens and NER tags for semi-supervised learning and show improvements in low-resource settings.
- Abstract: We investigate methods for semi-supervised learning (SSL) of a neural linear-chain conditional random field (CRF) for Named Entity Recognition (NER) by treating the tagger as the amortized variational posterior in a generative model of text given tags. We first illustrate how to incorporate a CRF in a VAE, enabling end-to-end training on semi-supervised data. We then investigate a series of increasingly complex deep generative models of tokens given tags enabled by end-to-end optimization, comparing the proposed models against supervised and strong CRF SSL baselines on the Ontonotes5 NER dataset. We find that our best proposed model consistently improves performance by $\approx 1\%$ F1 in low- and moderate-resource regimes and easily addresses degenerate model behavior in a more difficult, partially supervised setting.