Libra-V: Large Chinese-based Safeguard for Multimodal AI Content

ACL ARR 2025 May Submission 2466 Authors

19 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Multimodal Large Language Models (MLLMs) demonstrate remarkable visual-language reasoning capabilities but present significant safety challenges, particularly in multilingual contexts. Existing guardrail systems offer limited support for Chinese content, and specialized evaluation benchmarks for it are notably absent. We introduce Libra-V, a comprehensive multimodal safeguard framework designed specifically for Chinese scenarios. Based on expert studies, we first establish a detailed safety taxonomy. This taxonomy serves as the foundation for a training dataset of more than 140,000 annotated Chinese multimodal harmful query-response pairs and a test benchmark with dual evaluation protocols (in-distribution and out-of-distribution) that uses three-category safety annotations to assess Chinese multimodal guardrails. The dataset spans multiple harm categories, including legal violations, psychological harm, ethical issues, and privacy concerns, with specialized coverage of Chinese cultural and linguistic contexts. Extensive experimental results demonstrate that Libra-V substantially enhances MLLM safety while preserving model performance on legitimate tasks, representing a meaningful advance in the development of Chinese multimodal guardrails. The dataset and model will be open-sourced soon.
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: cross-modal information extraction
Contribution Types: Data resources
Languages Studied: Chinese
Submission Number: 2466