Abstract: The BabyLM Challenge aims to pre-train a language model on a small-scale dataset of input intended for children. In this work, we adapted the architecture and masking policy of BabyBERTa (Huebner et al., 2021) to address the strict-small track of the BabyLM Challenge. Our model, Penn & BGU BabyBERTa+, was pre-trained on the strict-small corpus and evaluated on the three benchmarks of the BabyLM Challenge. Experimental results indicate that our model achieves performance higher than or comparable to the RoBERTa baseline in predicting 17 grammatical phenomena.
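The abstract does not spell out the masking-policy change. As a rough illustration (not the authors' code), the sketch below shows the BabyBERTa-style variant reported by Huebner et al. (2021), in which every token selected for prediction is replaced by [MASK], rather than following RoBERTa's 80/10/10 corruption scheme; the function name and the 15% masking rate are illustrative assumptions.

```python
import torch

def babyberta_style_masking(input_ids: torch.Tensor,
                            mask_token_id: int,
                            mask_prob: float = 0.15):
    """Prepare masked-LM inputs and labels, BabyBERTa-style.

    Unlike the RoBERTa default (80% [MASK] / 10% random token / 10% unchanged),
    every selected position is replaced by [MASK], so the model is never asked
    to predict a token it can already see.
    """
    labels = input_ids.clone()
    # Sample which positions will be predicted.
    selected = torch.bernoulli(
        torch.full(labels.shape, mask_prob, dtype=torch.float)
    ).bool()
    labels[~selected] = -100          # ignored by the cross-entropy loss
    masked_inputs = input_ids.clone()
    masked_inputs[selected] = mask_token_id
    return masked_inputs, labels
```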