MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection under Cloaking Perturbations
Abstract: Hate speech detection on Chinese social media platforms poses distinct challenges, particularly due to the widespread use of cloaking techniques designed to evade conventional text-based detection systems. Although large language models (LLMs) have recently improved hate speech detection capabilities, the majority of existing work has concentrated on English-language datasets, with limited attention given to multimodal strategies in the Chinese context. In this study, we propose MMBERT, a novel BERT-based multimodal framework that integrates textual, speech, and visual modalities through a Mixture-of-Experts (MoE) architecture. To address the instability associated with directly integrating MoE into BERT-based models, we develop a progressive three-stage training paradigm. MMBERT incorporates modality-specific experts, a shared self-attention mechanism, and a router-based expert allocation strategy to enhance robustness against adversarial perturbations. Empirical evaluations on multiple Chinese hate speech datasets demonstrate that MMBERT substantially outperforms both fine-tuned BERT-based encoder models and prompt-based in-context learning with LLMs.
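To make the routing idea in the abstract concrete, below is a minimal sketch (not the authors' released code) of a Mixture-of-Experts layer in which token representations from all modalities pass through one shared self-attention block and a learned router then dispatches each token to a modality-specific feed-forward expert. All module names, dimensions, and the top-1 routing choice are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityMoELayer(nn.Module):
    def __init__(self, d_model=768, n_heads=12, n_experts=3, d_ff=3072):
        super().__init__()
        # Shared self-attention over the fused multimodal token sequence.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        # One feed-forward expert per modality (text / speech / vision assumed here).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        # Router producing a distribution over experts for every token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):
        # x: (batch, seq_len, d_model) token sequence combining all modalities.
        attn_out, _ = self.attn(x, x, x)
        h = self.norm1(x + attn_out)

        gate = F.softmax(self.router(h), dim=-1)   # (batch, seq_len, n_experts)
        top_w, top_idx = gate.max(dim=-1)          # top-1 routing (an assumption)

        out = torch.zeros_like(h)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i                    # tokens routed to expert i
            if mask.any():
                out[mask] = top_w[mask].unsqueeze(-1) * expert(h[mask])
        return self.norm2(h + out)


if __name__ == "__main__":
    layer = ModalityMoELayer()
    tokens = torch.randn(2, 64, 768)   # dummy multimodal token batch
    print(layer(tokens).shape)         # torch.Size([2, 64, 768])

The per-modality experts and shared attention mirror the architecture described in the abstract at a high level; the paper's actual gating function, expert count, and three-stage training schedule may differ.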
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: multimodal applications, hate speech detection, multimodality, spoken language understanding, NLP tool for social analysis
Contribution Types: Model analysis & interpretability
Languages Studied: Chinese
Submission Number: 5261