Euphemism Identification via Feature Fusion and Individualization

Published: 23 Jan 2024, Last Modified: 23 May 2024, TheWebConf24 Oral
Keywords: Euphemism, Euphemism Identification, Social Network Security, Feature Fusion, Feature Individualization
Abstract: Euphemisms are indirect words used to convey sensitive or harsh concepts. For instance, "ice" serves as a euphemism for the target keyword "methamphetamine" in illicit transactions. Euphemisms are widely used on social media and darknet marketplaces to evade moderation and supervision. Thus, euphemism identification, which aims to map a euphemism to its secret meaning (target keyword), is a crucial task in ensuring social network security. However, this task poses significant challenges, including resource limitations due to the unavailability of annotated datasets and linguistic challenges arising from subtle differences in meaning between target keywords. Existing methods have employed self-supervised schemes to automatically construct labeled training data, addressing the resource limitations. Yet these methods rely on static embeddings that fail to distinguish between literal and euphemistic senses, leading to confusion between target keywords with similar meanings. In addition, we observe that different euphemisms in similar contexts confuse the identification results. To overcome these obstacles, we propose a feature fusion and individualization (FFI) method for euphemism identification. First, we reformulate the task as a cloze task, making it more feasible. Next, we develop a feature fusion module to capture both dynamic global and static local features, enhancing discrimination between different euphemisms in similar contexts. Additionally, we employ a feature individualization module to ensure each target keyword has a unique feature representation by projecting features into their orthogonal space. As a result, FFI can effectively identify subtle semantic differences between similar euphemisms that refer to target keywords with similar meanings. Experimental results demonstrate that our method outperforms state-of-the-art methods and large language models (GPT3.5, Llama2, mPLUG-Owl, etc.), providing robust support for its effectiveness.
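As a rough illustration of two ideas named in the abstract, the sketch below shows (1) turning a sentence into a cloze-style input by masking the euphemism and (2) removing from a feature vector its component along another vector, which is one common reading of "projecting into an orthogonal space". This is not the authors' implementation: the `[MASK]` token, the function names `to_cloze` and `project_orthogonal`, and the choice of what is projected against what are all assumptions made for illustration.

```python
MASK = "[MASK]"  # assumed BERT-style mask token; the abstract does not name one


def to_cloze(sentence, euphemism, mask=MASK):
    """Replace each whitespace-delimited occurrence of the euphemism with the mask.

    Hypothetical helper: matching is case-insensitive and ignores tokens with
    attached punctuation, which a real tokenizer would handle.
    """
    target = euphemism.lower()
    return " ".join(mask if tok.lower() == target else tok
                    for tok in sentence.split())


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def project_orthogonal(feature, shared):
    """Return the part of `feature` that is orthogonal to `shared`.

    Computes feature - (<feature, shared> / <shared, shared>) * shared.
    Which vectors play these roles in FFI is an assumption here.
    """
    denom = dot(shared, shared)
    if denom == 0:
        return list(feature)  # nothing to remove along a zero vector
    scale = dot(feature, shared) / denom
    return [f - scale * s for f, s in zip(feature, shared)]


if __name__ == "__main__":
    print(to_cloze("selling ice tonight", "ice"))
    print(project_orthogonal([3.0, 4.0], [1.0, 0.0]))
```

In this sketch the masked sentence would be handed to a masked language model that ranks candidate target keywords for the blank, and the orthogonal projection keeps only what is distinctive about one keyword's features relative to the vector being projected against.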
Track: Social Networks, Social Media, and Society
Submission Guidelines Scope: Yes
Submission Guidelines Blind: Yes
Submission Guidelines Format: Yes
Submission Guidelines Limit: Yes
Submission Guidelines Authorship: Yes
Student Author: Yes
Submission Number: 670