Token and part-of-speech fusion for pretraining of transformers with application in automatic cyberbullying detection

Published: 01 Jan 2025, Last Modified: 23 Jun 2025
Venue: Nat. Lang. Process. J. 2025
License: CC BY-SA 4.0
Abstract: Highlights
• We propose a transformer pretraining method combining ELECTRA with POS labels.
• We identify tokenization strategies by comparing SentencePiece and WordPiece methods.
• Integrating grammar in transformers boosts NLP, particularly cyberbullying detection.
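
To make the idea of fusing tokens with POS labels concrete, here is a minimal sketch of one possible text-level scheme, in which each word is joined with its part-of-speech label before tokenization. This is an illustration only and not necessarily the method used in the paper: the function name `fuse_tokens_with_pos`, the `_` separator, and the hand-assigned Universal POS tags are all assumptions made for this example.

```python
from typing import List, Tuple


def fuse_tokens_with_pos(tagged: List[Tuple[str, str]], sep: str = "_") -> List[str]:
    """Join each word with its POS label into one fused unit.

    Hypothetical scheme for illustration; the paper's actual fusion
    strategy may differ (for example, it could operate on embeddings).
    """
    return [f"{token}{sep}{pos}" for token, pos in tagged]


# Hand-assigned tags for illustration; a real pipeline would use a POS tagger.
example = [("you", "PRON"), ("are", "AUX"), ("pathetic", "ADJ")]
fused = fuse_tokens_with_pos(example)
print(" ".join(fused))
```

Text fused this way could then be passed to a subword tokenizer such as SentencePiece or WordPiece; how the tokenizer splits the fused units is one of the design questions such a comparison would need to address.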