Can Transformers Be One-To-Many Estimators?

Anonymous

16 Dec 2022 (modified: 05 May 2023) · ACL ARR 2022 December Blind Submission
Abstract: Many natural language generation tasks admit more than one acceptable output for a given input. Models, however, are typically trained as one-to-one function estimators, shielding them from the complexity of learning multiple valid outputs per input sample. In this work, we study the one-to-many setting through single-input multi-output training and evaluation regimens (SIMOR). Specifically, we show that training natural language generation models on datasets with multiple valid outputs per input yields better performance than the simplified one-to-one setup commonly used in the literature. Using SIMOR, our experiments show that on the CFQ dataset, models learn to emit valid SPARQL programs 10x faster and reach higher overall performance. Our experiments also demonstrate gains in BLEU and TER on low-resource datasets extracted from the WMT16 de-en benchmark.
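
To make the training regimen concrete, here is a minimal sketch of one plausible reading of SIMOR's data preparation: each input is paired with every one of its valid outputs, rather than a single canonical reference. The abstract does not specify the exact procedure, so the function name, data layout, and the SPARQL example below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of single-input multi-output (SIMOR-style) data expansion.
# Assumption: the simplest one-to-many regimen treats each (input, valid_output_i)
# pair as its own training example instead of keeping one canonical reference.

from typing import Iterable


def expand_one_to_many(
    dataset: Iterable[tuple[str, list[str]]],
) -> list[tuple[str, str]]:
    """Flatten (input, [output_1, ..., output_k]) records into k one-to-one pairs."""
    pairs = []
    for source, valid_targets in dataset:
        for target in valid_targets:
            pairs.append((source, target))
    return pairs


# Example: a question with two semantically equivalent SPARQL renderings
# (the query text here is illustrative, not taken from CFQ).
data = [
    (
        "Who directed Alien?",
        [
            "SELECT ?x WHERE { ?x :directed :Alien }",
            "SELECT ?x WHERE { :Alien :directedBy ?x }",
        ],
    ),
]
print(expand_one_to_many(data))
```

Under this reading, a standard sequence-to-sequence training loop needs no changes; the one-to-many structure is expressed entirely through the expanded training pairs.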
Paper Type: short
Research Area: Efficient Methods for NLP