Abstract: Transformer models like BERT have shown great success in various natural language processing tasks, but they often require a significant amount of time and computational resources to train and deploy due to their large size. In this analysis, we compare the accuracy of four small BERT transformers as a way to reduce computational requirements while maintaining comparable performance. Additionally, we examine the impact of using an MLP decoder, which appears to improve the accuracy of the medium BERT model. We evaluate our results on a new benchmark we call the Sequence labellIng evaLuatIon benChmark for spoken laNguagE (SILICONE).
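To make the setup concrete, below is a minimal sketch (not the authors' code) of pairing a small BERT encoder with an MLP decoder head for utterance-level labelling. The checkpoint name, label count, and hidden size are illustrative assumptions only.

```python
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class SmallBertMLPClassifier(nn.Module):
    """Small BERT encoder followed by an MLP decoder head (illustrative sketch)."""
    def __init__(self, encoder_name="google/bert_uncased_L-8_H-512_A-8",
                 num_labels=5, mlp_hidden=256):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        dim = self.encoder.config.hidden_size
        # MLP decoder: a hidden layer with a non-linearity instead of a single
        # linear projection on top of the [CLS] representation.
        self.decoder = nn.Sequential(
            nn.Linear(dim, mlp_hidden),
            nn.GELU(),
            nn.Dropout(0.1),
            nn.Linear(mlp_hidden, num_labels),
        )

    def forward(self, input_ids, attention_mask):
        # Take the [CLS] token representation as the utterance embedding.
        cls = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state[:, 0]
        return self.decoder(cls)  # (batch, num_labels)

# Example usage: score a single utterance (hypothetical label set).
name = "google/bert_uncased_L-8_H-512_A-8"
tokenizer = AutoTokenizer.from_pretrained(name)
model = SmallBertMLPClassifier(encoder_name=name)
batch = tokenizer(["could you book a table for two?"], return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
```

Swapping the checkpoint name lets the same head be reused across the different small BERT sizes being compared.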