Abstract: The BabyLM Challenge aims to pre-train a language model on a small-scale dataset of input intended for children. In this work, we adapted the architecture and masking policy of BabyBERTa (Huebner et al., 2021) to address the strict-small track of the BabyLM Challenge. Our model, Penn & BGU BabyBERTa+, was pre-trained on the strict-small corpus and evaluated on the three benchmarks of the BabyLM Challenge. Experimental results indicate that our model achieves performance higher than or comparable to the RoBERTa baseline in predicting 17 grammatical phenomena.
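The abstract does not spell out the masking-policy change. As a rough illustration (not the authors' code), the sketch below shows the BabyBERTa-style variant reported by Huebner et al. (2021), in which every token selected for prediction is replaced by [MASK], rather than following RoBERTa's 80/10/10 corruption scheme; the function name and the 15% masking rate are illustrative assumptions.

```python
import torch

def babyberta_style_masking(input_ids: torch.Tensor,
                            mask_token_id: int,
                            mask_prob: float = 0.15):
    """Prepare masked-LM inputs and labels, BabyBERTa-style.

    Unlike the RoBERTa default (80% [MASK] / 10% random token / 10% unchanged),
    every selected position is replaced by [MASK], so the model is never asked
    to predict a token it can already see.
    """
    labels = input_ids.clone()
    # Sample which positions will be predicted.
    selected = torch.bernoulli(
        torch.full(labels.shape, mask_prob, dtype=torch.float)
    ).bool()
    labels[~selected] = -100          # ignored by the cross-entropy loss
    masked_inputs = input_ids.clone()
    masked_inputs[selected] = mask_token_id
    return masked_inputs, labels
```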