Old BERT, New Tricks: Artificial Language Learning for Pre-Trained Language Models

Anonymous

16 Jan 2022 (modified: 05 May 2023) · ACL ARR 2022 January Blind Submission · Readers: Everyone
Abstract: We extend the artificial language learning experimental paradigm from psycholinguistics and apply it to pre-trained language models -- specifically, BERT (Devlin et al., 2019). We treat a pre-trained model as a subject in an artificial language learning experimental setting: in order to learn the relation between two linguistic properties $A$ and $B$, we introduce a set of new, non-existent linguistic items, give the model information about their variation along property $A$, then measure to what extent the model learns property $B$ for these items as a result of training. We show this method at work for degree modifiers (expressions like {\it slightly}, {\it very}, {\it rather}, {\it extremely}) and test the hypothesis that the degree expressed by the modifier (low, medium or high degree) is related to its sensitivity to sentence polarity (whether it shows a preference for affirmative or negative sentences, or neither). Our experimental results are compatible with existing linguistic observations that relate degree semantics to polarity sensitivity, including the main one: low degree semantics leads to positive polarity sensitivity (that is, to a preference for affirmative contexts).
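
As a rough illustration of this kind of setup (a sketch, not the authors' actual protocol or stimuli), the Python snippet below adds a hypothetical nonce modifier to a pre-trained BERT, fine-tunes it on a few illustrative "low-degree" sentences (property $A$), and then probes polarity preference (property $B$) by comparing masked-language-model scores of the nonce word in an affirmative versus a negative context. The nonce word "blicketly", the sentence templates, and the simplified training loop are all assumptions made for this example.

import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# 1. Introduce a new, non-existent linguistic item (a hypothetical nonce modifier).
nonce = "blicketly"
tokenizer.add_tokens([nonce])
model.resize_token_embeddings(len(tokenizer))

# 2. Give the model information about property A (degree semantics) by fine-tuning
#    on sentences where the nonce word patterns like a low-degree modifier.
#    Using labels=input_ids computes the loss over every position; a full MLM
#    setup would mask tokens, but this keeps the sketch short.
train_sentences = [
    f"The soup is {nonce} warm, but not very warm.",
    f"She was {nonce} tired, just a little bit.",
]
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for sentence in train_sentences:
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model(**inputs, labels=inputs["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# 3. Measure property B (polarity sensitivity): compare the masked-LM log-probability
#    of the nonce word in an affirmative vs. a negative template.
model.eval()

def nonce_logprob(template: str) -> float:
    """Log-probability of the nonce modifier in the [MASK] slot of a template."""
    inputs = tokenizer(template, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    log_probs = torch.log_softmax(logits, dim=-1)
    return log_probs[tokenizer.convert_tokens_to_ids(nonce)].item()

affirmative = nonce_logprob("The coffee is [MASK] hot.")
negative = nonce_logprob("The coffee is not [MASK] hot.")
print(f"affirmative: {affirmative:.3f}, negative: {negative:.3f}")

Under the hypothesis tested in the paper, a nonce item trained with low-degree semantics would be expected to score higher in the affirmative template than in the negative one; the comparison above is only meaningful when aggregated over many templates and training runs.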
Paper Type: long