Token and part-of-speech fusion for pretraining of transformers with application in automatic cyberbullying detection
Abstract: Highlights
• We propose a transformer pretraining method that combines ELECTRA with part-of-speech (POS) labels.
• We identify tokenization strategies by comparing the SentencePiece and WordPiece methods.
• Integrating grammatical information into transformers improves performance on NLP tasks, particularly cyberbullying detection.