Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve

Published: 31 Oct 2022, Last Modified: 16 Oct 2022, NeurIPS 2022 Accept, Readers: Everyone
Keywords: multitask learning, robustness, linear representation learning
Abstract: We find a surprising connection between multitask learning and robustness to neuron failures. Our experiments show that bilingual language models retain higher performance under various neuron perturbations, such as random deletions, magnitude pruning, and weight noise. Our study is motivated by research in cognitive science showing that symptoms of dementia and cognitive decline appear later in bilingual speakers than in monolingual patients with similar brain damage, a phenomenon called bilingual cognitive reserve. Our language model experiments replicate this phenomenon on bilingual GPT-2 and other models. We provide a theoretical justification of this robustness by mathematically analyzing linear representation learning and showing that multitasking creates more robust representations. We open-source our code and models at the following URL: \_robustness.
TL;DR: We show, theoretically and experimentally, that multitask learning increases robustness to structural perturbations.
Supplementary Material: pdf
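The abstract names three structural perturbations: random deletions, magnitude pruning, and weight noise. As a rough illustration of what each one does to a weight matrix, here is a minimal NumPy sketch. It is not the authors' code: the function names, the parameters (`drop_prob`, `fraction`, `sigma`), and the choice of Gaussian noise are assumptions made for illustration, and the sketch operates on a single matrix, not on GPT-2.

```python
import numpy as np


def random_deletion(weights, drop_prob, rng):
    """Zero each weight independently with probability drop_prob (hypothetical helper)."""
    keep_mask = rng.random(weights.shape) >= drop_prob
    return weights * keep_mask


def magnitude_pruning(weights, fraction):
    """Zero the given fraction of weights with the smallest absolute value (hypothetical helper)."""
    num_pruned = int(fraction * weights.size)
    pruned = weights.flatten()  # flatten() returns a copy, so the input is untouched
    if num_pruned > 0:
        # Flat indices sorted by ascending magnitude; the first num_pruned are removed.
        order = np.argsort(np.abs(pruned))
        pruned[order[:num_pruned]] = 0.0
    return pruned.reshape(weights.shape)


def weight_noise(weights, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma (assumed noise model)."""
    return weights + rng.normal(0.0, sigma, size=weights.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 5))  # stand-in for one layer's weight matrix
    print("nonzeros after deletion:", np.count_nonzero(random_deletion(W, 0.3, rng)))
    print("nonzeros after pruning: ", np.count_nonzero(magnitude_pruning(W, 0.5)))
    print("mean |change| from noise:", np.abs(weight_noise(W, 0.1, rng) - W).mean())
```

In a robustness study along the lines described above, one would apply such a perturbation at increasing strength to a monolingual and a bilingual model and compare how quickly task performance degrades.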