Abstract: Offensive speech detection (OSD) has been a prominent research topic in natural language processing (NLP). However, the development of Chinese OSD is constrained by the lack of sufficient benchmark datasets. Moreover, Chinese OSD faces challenges such as ambiguity, context dependence, and, in particular, the identification of implicit offensive speech. To address these challenges, we introduce a fine-grained labeling system for 10 categories of implicit offensive speech, grounded in linguistic principles, and present SinOffen, a comprehensive real-world Chinese offensive speech dataset constructed based on this system. We benchmark a range of mainstream pretrained language models (PLMs) and generative large language models (LLMs) on SinOffen, with a particular focus on performance gaps in implicit categories. Through detailed empirical analysis, we uncover key limitations in current models’ ability to handle subtle, context-dependent offense. Our findings underscore the urgent need for more sophisticated detection approaches tailored to the evolving and often covert nature of implicit offensive speech in Chinese.
External IDs: doi:10.1109/tcss.2026.3668363