- Keywords: NLG, Few shot, low resource, data efficient, BART
- TL;DR: Data efficient approaches for bootstrapping NLG models
- Abstract: Natural language generation (NLG) plays an important role in task-oriented dialog systems, providing meaningful and natural responses to users' requests. However, training an NLG model that can produce production-quality responses usually requires a large amount of training data. In this paper, we propose two novel data-efficient approaches to bootstrap the model. We first propose a template-based approach that leverages a scenario generation framework to create full coverage of possible scenarios and their corresponding synthetic annotations. Second, we leverage the pretrained BART model with a bucketing method that groups scenarios based on their dialog act structures. Extensive experiments on three datasets show that our approaches achieve production quality with 10 times less labelled data than a standard NLG dataset.
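The bucketing idea described in the abstract can be illustrated with a minimal sketch. This is an assumption-laden toy example, not the paper's implementation: scenarios are represented as hypothetical lists of (dialog act, slot) pairs, and grouping keys on the sequence of dialog acts while ignoring slot values.

```python
from collections import defaultdict

# Hypothetical scenarios: each is a list of (dialog_act, slot) pairs.
# These names and structures are illustrative, not from the paper.
scenarios = [
    [("INFORM", "temperature"), ("INFORM", "condition")],
    [("INFORM", "date"), ("INFORM", "condition")],
    [("REQUEST", "location")],
]

# Bucket scenarios by their dialog-act structure (the ordered
# sequence of acts), ignoring which slots each act carries.
buckets = defaultdict(list)
for scenario in scenarios:
    structure = tuple(act for act, _slot in scenario)
    buckets[structure].append(scenario)

for structure, group in buckets.items():
    print(structure, len(group))
# The two INFORM-INFORM scenarios land in one bucket;
# the lone REQUEST scenario lands in another.
```

Grouping by act structure rather than surface text lets training examples with the same response "shape" be treated together, which is the intuition behind using buckets to make fine-tuning more data-efficient.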