L$^3$B – Lies Lie in Linguistic Behavior: Learning Verbal Indicators for Content Veracity Classification
Abstract: Unverified content undermines information integrity, making effective veracity classification crucial. Existing content veracity classifiers are mostly supervised machine learning models that, despite high accuracy, generalize poorly because they rely heavily on raw content data. To address this issue, we propose a behavior-aware classification model (L$^3$B) that leverages latent linguistic behavior and external social context to extract contextually grounded features, reducing reliance on content data and sensitivity to dataset biases. First, we extract verb-based features from news content as linguistic behavior features, capturing nuanced behavioral indicators of content veracity. Then, a knowledge-based linking scheme incorporates social context by aligning the verbs extracted from the content with those derived from linked social context via semantic similarity. Finally, a Transformer-based classifier fuses the textual, behavioral, and contextual features and classifies the content veracity (i.e., high or low veracity). Experimental results on public datasets demonstrate that our model outperforms most advanced classification approaches and generalizes better across diverse datasets, highlighting its effectiveness and robustness.
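The verb-alignment step in the abstract can be illustrated with a minimal sketch. The embedding below is a toy bag-of-character-trigram vector standing in for the paper's actual semantic encoder, and all function names and the threshold value are hypothetical, not taken from the paper:

```python
from collections import Counter
from math import sqrt

def trigram_vector(word):
    # Toy embedding: bag of character trigrams (stand-in for a real semantic encoder).
    padded = f"##{word}##"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    dot = sum(u[k] * v[k] for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def align_verbs(content_verbs, context_verbs, threshold=0.5):
    # Link each content verb to its most similar context verb above the threshold.
    links = {}
    for cv in content_verbs:
        best = max(context_verbs,
                   key=lambda xv: cosine(trigram_vector(cv), trigram_vector(xv)))
        score = cosine(trigram_vector(cv), trigram_vector(best))
        if score >= threshold:
            links[cv] = (best, round(score, 2))
    return links

print(align_verbs(["claims", "denies"], ["claimed", "confirm"]))
```

With these toy vectors, "claims" links to "claimed" while "denies" finds no context verb above the threshold and is left unaligned; a real implementation would substitute contextual sentence or word embeddings for the trigram vectors.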
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: human behavior analysis; misinformation detection and analysis; sociolinguistics; NLP tools for social analysis; quantitative analyses of news and/or social media
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English
Submission Number: 5324