Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU

Anonymous

17 Dec 2021 (modified: 05 May 2023) · ACL ARR 2021 December Blind Submission
Abstract: Curriculum Learning (CL) is a technique that trains models on examples ranked in a typically increasing order of difficulty, with the aim of accelerating convergence and improving generalisability. However, current CL approaches for Natural Language Understanding (NLU) tasks target in-domain performance, often via difficulty metrics that are detached from the model being trained. In this work, we instead employ CL for NLU by using training dynamics as difficulty metrics, i.e. statistics that measure the behaviour of the model at hand on data instances during training. In addition, we propose two modifications of existing CL schedulers based on these statistics. Unlike existing work, we focus on evaluating models on out-of-distribution data as well as on languages other than English via zero-shot cross-lingual transfer. Across four XNLU tasks, we show that CL with training dynamics, in both monolingual and cross-lingual settings, can achieve significant speedups of up to 58%. We also find that performance can be improved on challenging tasks, with OOD generalisation up by 8% and zero-shot cross-lingual transfer up by 1%. Overall, our experiments indicate that training dynamics can lead to better-performing models and smoother training compared to other difficulty metrics.
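The abstract's core idea, ranking examples by statistics of the model's own behaviour during training, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the specific statistics (mean and standard deviation of the gold-label probability across epochs, in the style of dataset cartography) and all function names are assumptions, and the per-epoch probabilities are synthetic.

```python
import numpy as np

def training_dynamics_stats(epoch_probs):
    """Compute per-example training-dynamics statistics.

    epoch_probs: array of shape (n_epochs, n_examples) holding the model's
    probability of the gold label for each example at each epoch.
    Returns (confidence, variability): mean and std across epochs.
    """
    confidence = epoch_probs.mean(axis=0)   # high mean = consistently easy
    variability = epoch_probs.std(axis=0)   # high std = ambiguous/unstable
    return confidence, variability

def curriculum_order(epoch_probs):
    """Rank examples easy-to-hard: descending mean confidence."""
    confidence, _ = training_dynamics_stats(epoch_probs)
    return np.argsort(-confidence)

# Toy illustration: synthetic dynamics for 4 examples over 3 epochs.
probs = np.array([
    [0.90, 0.20, 0.60, 0.50],
    [0.95, 0.30, 0.70, 0.40],
    [0.97, 0.25, 0.80, 0.45],
])
order = curriculum_order(probs)
print(order)  # easiest example (highest mean confidence) first
```

A CL scheduler would then feed the training loop batches drawn from progressively harder portions of this ordering, rather than sampling uniformly from the start.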
Paper Type: long
Consent To Share Data: yes