MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection under Cloaking Perturbations
Abstract: Hate speech detection on Chinese social media platforms poses distinct challenges, particularly due to the widespread use of cloaking techniques designed to evade conventional text-based detection systems. Although large language models (LLMs) have recently improved hate speech detection capabilities, the majority of existing work has concentrated on English-language datasets, with limited attention given to multimodal strategies in the Chinese context. In this study, we propose MMBERT, a novel BERT-based multimodal framework that integrates textual, speech, and visual modalities through a Mixture-of-Experts (MoE) architecture. To address the instability associated with directly integrating MoE into BERT-based models, we develop a progressive three-stage training paradigm. MMBERT incorporates modality-specific experts, a shared self-attention mechanism, and a router-based expert allocation strategy to enhance robustness against adversarial perturbations. Empirical evaluations on multiple Chinese hate speech datasets demonstrate that MMBERT substantially outperforms both fine-tuned BERT-based encoder models and prompt-based in-context learning with LLMs.
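To make the routing idea in the abstract concrete, below is a minimal sketch (not the authors' released code) of a Mixture-of-Experts layer in which token representations from all modalities pass through one shared self-attention block and a learned router then dispatches each token to a modality-specific feed-forward expert. All module names, dimensions, and the top-1 routing choice are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityMoELayer(nn.Module):
    def __init__(self, d_model=768, n_heads=12, n_experts=3, d_ff=3072):
        super().__init__()
        # Shared self-attention over the fused multimodal token sequence.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        # One feed-forward expert per modality (text / speech / vision assumed here).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        # Router producing a distribution over experts for every token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):
        # x: (batch, seq_len, d_model) token sequence combining all modalities.
        attn_out, _ = self.attn(x, x, x)
        h = self.norm1(x + attn_out)

        gate = F.softmax(self.router(h), dim=-1)   # (batch, seq_len, n_experts)
        top_w, top_idx = gate.max(dim=-1)          # top-1 routing (an assumption)

        out = torch.zeros_like(h)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i                    # tokens routed to expert i
            if mask.any():
                out[mask] = top_w[mask].unsqueeze(-1) * expert(h[mask])
        return self.norm2(h + out)


if __name__ == "__main__":
    layer = ModalityMoELayer()
    tokens = torch.randn(2, 64, 768)   # dummy multimodal token batch
    print(layer(tokens).shape)         # torch.Size([2, 64, 768])

The per-modality experts and shared attention mirror the architecture described in the abstract at a high level; the paper's actual gating function, expert count, and three-stage training schedule may differ.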
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: multimodal applications, hate speech detection, multimodality, spoken language understanding, NLP tool for social analysis
Contribution Types: Model analysis & interpretability
Languages Studied: Chinese
Submission Number: 5261