Enabling Low-Resource Transfer Learning across COVID-19 Corpora by Combining Event-Extraction and Co-Training

Published: 04 Jul 2020, Last Modified: 05 May 2023
Venue: NLP-COVID-2020
Keywords: co-training, cross domain learning, natural language processing, covid-19, coronavirus, social science, event-extraction
TL;DR: We show gains in low-resource transfer learning across corpora with RoBERTa by applying co-training, using event-extraction as a view.
Abstract: Social-science investigations can benefit from direct comparisons of heterogeneous corpora: in this work, we compare U.S. state-level COVID-19 policy announcements with policy discussions on Twitter. This task requires classifiers with high transfer accuracy that can both (1) classify policy announcements and (2) classify tweets. We find that co-training with event-extraction views improves the transfer accuracy of our RoBERTa classifier by 11% over a baseline. The same improvements are not observed with a baseline classifier, on a baseline classification task, or with baseline views. With only a small set of 576 COVID-19 policy announcements, hand-classified into one of six categories, our co-trained RoBERTa classifier achieves a maximum transfer F1 score of 0.77 on a hand-validated set of tweets. This work represents the first known application of these techniques to an NLP transfer learning task and facilitates the cross-corpora comparisons necessary for studies of social-science phenomena.
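The core technique named in the abstract is co-training across two views of the same data. As a rough illustration only, the sketch below implements classic co-training in the style of Blum & Mitchell (1998): it substitutes logistic-regression models over synthetic feature views for the paper's RoBERTa and event-extraction components, so the toy data and every name here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: two feature "views" of the same 600 examples, standing in for
# (1) RoBERTa text representations and (2) event-extraction features.
n = 600
y = rng.integers(0, 2, size=n)
X_text = y[:, None] + rng.normal(size=(n, 20))
X_event = y[:, None] + rng.normal(size=(n, 5))

labeled = list(range(50))                # small hand-labeled seed set
unlabeled = set(range(50, n))
pseudo_labels = {i: y[i] for i in labeled}

clf_text = LogisticRegression(max_iter=1000)
clf_event = LogisticRegression(max_iter=1000)

for _ in range(5):                       # co-training rounds
    idx = np.fromiter(pseudo_labels, dtype=int)
    targets = np.array([pseudo_labels[i] for i in idx])
    clf_text.fit(X_text[idx], targets)
    clf_event.fit(X_event[idx], targets)

    pool = np.fromiter(unlabeled, dtype=int)
    if pool.size == 0:
        break
    # Each view pseudo-labels its 10 most confident unlabeled examples;
    # both views then retrain on the grown labeled pool next round.
    for clf, X in ((clf_text, X_text), (clf_event, X_event)):
        confidence = clf.predict_proba(X[pool]).max(axis=1)
        for i in pool[np.argsort(-confidence)[:10]]:
            if i in unlabeled:
                pseudo_labels[i] = int(clf.predict(X[[i]])[0])
                unlabeled.discard(i)

print(f"text-view accuracy: {(clf_text.predict(X_text) == y).mean():.2f}")
```

The usual requirement for co-training is that the two views each be informative on their own and approximately conditionally independent given the label, which is what motivates pairing raw text with structured event-extraction output.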