Simpler neural networks prefer subregular languages

Published: 07 Oct 2023, Last Modified: 01 Dec 2023
EMNLP 2023 Findings
Submission Type: Regular Long Paper
Submission Track: Linguistic Theories, Cognitive Modeling, and Psycholinguistics
Keywords: subregularity, formal language theory, minimum description length, inductive biases
TL;DR: LSTMs subject to $L_0$ regularization show a pronounced preference for learning subregular rather than regular patterns. LSTMs that learn subregular languages attain higher accuracy with fewer parameters.
Abstract: We apply a continuous relaxation of $L_0$ regularization (Louizos et al., 2017), which induces sparsity, to study the inductive biases of LSTMs. In particular, we are interested in which patterns of formal languages are readily learned and expressed by LSTMs. Across a wide range of tests we find that sparse LSTMs prefer subregular languages over regular languages, and that the strength of this preference increases as we increase the pressure for sparsity. Furthermore, LSTMs trained on subregular languages have fewer non-zero parameters. We conjecture that this subregular bias in LSTMs is related to the cognitive bias for subregular languages observed in human phonology, both of which are downstream of a simplicity bias in a suitable description language.
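For reference, below is a minimal sketch (not the authors' released code) of the hard-concrete gate that implements the continuous relaxation of $L_0$ regularization from Louizos et al. (2017). The constants gamma, zeta, and beta are the values commonly used with that method, and the module name L0Gate is illustrative.

    import math
    import torch
    import torch.nn as nn

    class L0Gate(nn.Module):
        # Per-weight stochastic gates whose expected L0 norm is differentiable
        # (hard-concrete relaxation, Louizos et al., 2017). Illustrative sketch only.
        def __init__(self, shape, gamma=-0.1, zeta=1.1, beta=2/3):
            super().__init__()
            self.log_alpha = nn.Parameter(torch.zeros(shape))  # gate logits
            self.gamma, self.zeta, self.beta = gamma, zeta, beta

        def forward(self):
            if self.training:
                # Reparameterized sample from the hard-concrete distribution.
                u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
                s = torch.sigmoid((u.log() - (1 - u).log() + self.log_alpha) / self.beta)
            else:
                # Deterministic gate at evaluation time.
                s = torch.sigmoid(self.log_alpha)
            s_bar = s * (self.zeta - self.gamma) + self.gamma
            return s_bar.clamp(0.0, 1.0)  # gates in [0, 1]; exactly 0 with nonzero probability

        def l0_penalty(self):
            # Probability that each gate is non-zero; the sum is the expected L0 norm.
            return torch.sigmoid(self.log_alpha - self.beta * math.log(-self.gamma / self.zeta)).sum()

In such a setup, a weight tensor W is replaced by W * gate() in the forward pass and lambda * gate.l0_penalty() is added to the training loss; raising lambda corresponds to increasing the pressure for sparsity described in the abstract.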
Submission Number: 4879