Chunk Attention-based Learning: Obtain Interpretability and Boost Performance in Content Moderation

ACL ARR 2025 February Submission6016 Authors

16 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: The advent of ChatGPT and DeepSeek has led to an explosion of generated content across the internet, amplifying the demand for reliable and scalable moderation solutions capable of monitoring this sheer volume of content. Current approaches that rely on Deep Neural Networks (DNNs) fail to meet user expectations for transparency and reliability; moreover, they commonly block safe content while letting harmful content through. Rule-based approaches provide interpretability, but they are limited in scalability and fail to meet dynamic moderation needs. In this paper, we present CAL (Chunk Attention-based Learning to Obtain Interpretability and Boost Performance in Content Moderation): a novel approach that simultaneously provides interpretability and enhances classification performance in content moderation. Experiments on 8+ gold-standard multilingual datasets show that CAL outperforms traditional state-of-the-art approaches in interpretability and significantly improves the F1 score in text classification. Moreover, it achieves consistent gains across three different backbone models and three distinct taxonomy classification tasks. Finally, we validate CAL's practical scalability through seamless integration into a production-scale model, where it achieves millisecond latency while processing 3.5 billion daily requests.
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: Information Extraction, Information Retrieval and Text Mining, Interpretability and Analysis of Models for NLP, NLP Application, Machine Learning for NLP, Syntax: Tagging, Chunking and Parsing
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches for low-compute settings & efficiency
Languages Studied: English, Japanese, German, Spanish, French, Portuguese, Italian, Chinese
Submission Number: 6016
