Effective utilization of labeled data from related tasks using graph contrastive pretraining: application to disaster related text classification

Samujjwal Ghosh, Subhadeep Maji, Maunendra Sankar Desarkar

2022 (modified: 21 Jun 2022), SAC 2022

Abstract: Contrastive pretraining techniques for text classification have been studied largely in an unsupervised setting. However, labeled data from related past datasets that share label semantics with the current task is often available. We hypothesize that using this labeled data effectively can lead to better generalization on the current task. In this paper, we propose a novel way to utilize labeled data from related tasks with a graph-based supervised contrastive learning approach. We formulate a token graph by extrapolating the supervised information from examples to tokens. Our experiments with 8 disaster datasets show that our method outperforms the baselines as well as an example-level contrastive learning formulation. In addition, we show the cross-domain effectiveness of our method in a zero-shot setting.
