Abstract: The rapid growth of AI-generated content (AIGC) raises concerns about harmful outputs, such as misinformation and malicious misuse.
Existing detection methods face two key limitations:
(1) the lack of real-world AIGC scenarios and corresponding risk datasets, and
(2) the inability of both traditional models and multimodal large language models (MLLMs) to reliably detect risks in AIGC.
Towards this end, we introduce **AIGuard**, the first benchmark for AIGC risk detection in real-world e-commerce.
It includes 253,420 image-text pairs (*i.e.,* the risk content and risk description) across four critical categories: *abnormal body*, *violating physical laws*, *disharmonious background*, and *illegal message*.
To effectively detect these risks, we propose distilling text annotations into dense soft prompts and identifying risk content through image soft prompt matching during inference.
Experiments on the benchmark show that this method achieves 9.68% higher recall than leading multimodal models while using only 25% of the training resources and running inference 37.8× faster.
For further research, our benchmark and code are available at https://anonymous.4open.science/r/aigc-dataset-anonymous.
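The detection step described above — matching an image embedding against learned soft prompts at inference time — can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: the function name `match_risk`, the embedding dimensions, the cosine-similarity scoring, and the threshold are all assumptions for demonstration.

```python
import numpy as np

def match_risk(image_emb, soft_prompts, threshold=0.5):
    """Match an image embedding against a bank of distilled soft prompts.

    Returns (best_category_index, score) if the top similarity clears the
    threshold, otherwise (None, score). Hypothetical sketch, not the paper's code.
    """
    # L2-normalize so dot products become cosine similarities
    img = image_emb / np.linalg.norm(image_emb)
    prompts = soft_prompts / np.linalg.norm(soft_prompts, axis=1, keepdims=True)
    sims = prompts @ img                  # one similarity per risk category
    best = int(np.argmax(sims))
    score = float(sims[best])
    return (best, score) if score >= threshold else (None, score)

# Toy demo: 4 risk categories (as in the benchmark) with 16-dim embeddings
rng = np.random.default_rng(0)
soft_prompts = rng.normal(size=(4, 16))
# An image whose embedding lies close to category 2's soft prompt
image = soft_prompts[2] + 0.01 * rng.normal(size=16)
cat, score = match_risk(image, soft_prompts)
print(cat, round(score, 3))
```

Because matching reduces to a single normalized matrix-vector product over a small prompt bank, inference cost stays far below running a full MLLM per image, which is consistent with the speedup the abstract reports.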
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: image text matching, cross-modal information extraction
Contribution Types: Data resources
Languages Studied: English, Chinese
Submission Number: 3120