Transformers Can Compose Skills To Solve Novel Problems Without Finetuning

Anonymous

17 Sept 2021 (modified: 05 May 2023) · ACL ARR 2021 September Blind Submission
Abstract: Adding disparate new training tasks to an existing multitask training regime can improve a Transformer's prediction performance on unseen datasets. We demonstrate that this improvement is attributable to a compositional mechanism rather than to memorisation. Adding numerical reasoning tasks improves performance on the DROP, DROP-CS and ROPES datasets by over 26 percent without finetuning, while performance on seven other question-answering datasets that would not be expected to benefit remains essentially unchanged. By filtering our evaluation datasets to samples whose answers do not overlap with those of similar training samples, and then further restricting to the samples least semantically similar to the training set, we show that the improvement from adding numerical reasoning tasks is not attributable to direct lookup. Our code and filtered datasets are available at https://github.com/anonymised.
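The abstract describes a two-stage filtering of the evaluation sets: first removing samples whose answers overlap with those of similar training samples, then keeping only the samples least semantically similar to the training set. The sketch below illustrates one plausible way to implement such a filter; it is not the authors' released code (available at the repository above). The data layout (dicts with "question" and "answer" fields), the embedding model ("all-MiniLM-L6-v2" from the sentence-transformers library), and the retained fraction are all assumptions for illustration.

```python
# Hedged sketch of answer-overlap plus semantic-similarity filtering.
# Assumptions (not from the paper): samples are dicts with "question" and
# "answer" strings; similarity is cosine similarity over sentence embeddings;
# the least-similar 10% of overlap-free samples are retained.

from sentence_transformers import SentenceTransformer, util

def filter_eval_samples(eval_samples, train_samples, keep_fraction=0.1):
    """Drop eval samples whose answer appears among the training answers, then
    keep only the eval samples least semantically similar to training questions."""
    train_answers = {s["answer"].strip().lower() for s in train_samples}

    # Stage 1: remove samples with direct answer overlap.
    no_overlap = [s for s in eval_samples
                  if s["answer"].strip().lower() not in train_answers]

    # Stage 2: rank remaining samples by their maximum similarity to any
    # training question, using an assumed off-the-shelf embedding model.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    eval_emb = model.encode([s["question"] for s in no_overlap],
                            convert_to_tensor=True)
    train_emb = model.encode([s["question"] for s in train_samples],
                             convert_to_tensor=True)
    max_sim = util.cos_sim(eval_emb, train_emb).max(dim=1).values

    # Keep the least-similar fraction (ascending sort: least similar first).
    order = max_sim.argsort()
    n_keep = max(1, int(len(no_overlap) * keep_fraction))
    return [no_overlap[i] for i in order[:n_keep].tolist()]
```

Under these assumptions, any model improvement that survives this filter is harder to explain as retrieval of memorised training answers, which is the argument the abstract makes.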