$\mathcal{R}^3$: Advertisement Compliance $\mathcal{R}$ectification via Group-$\mathcal{R}$elative Experience Extractor and Curriculum $\mathcal{R}$einforcement

Published: 18 Apr 2026, Last Modified: 24 Apr 2026 | ACL 2026 Industry Track Poster | CC BY 4.0
Keywords: Advertisement Compliance; Advertising Creative Rectification; Text Rewriting; Reinforcement Learning; Large Language Model
Abstract: Rigorous content moderation is crucial for online advertising but leads to millions of daily rejections. This scale renders manual rectification infeasible, particularly for video advertisements. However, existing safety-driven methods often suffer from aggressive over-editing, which compromises the advertiser's original semantic intent merely to satisfy compliance. In this work, we target the rectification of textual violations in video ads, covering both speech transcripts and on-screen text. We propose $\mathcal{R}^3$, a novel framework designed to harmonize compliance with original semantic intent preservation. Our approach integrates three key innovations: (1) an experience-driven data synthesis framework that bootstraps high-quality supervision via a group-**R**elative compliance experience extractor; (2) a curriculum **R**einforcement learning strategy with hierarchical rewards designed to enforce compliance while maximizing semantic consistency; and (3) a comprehensive video **R**ectification framework seamlessly integrating text recognition, rewriting, and re-rendering for industrial deployment. Extensive experiments on industrial datasets and online A/B testing demonstrate that $\mathcal{R}^3$ significantly outperforms state-of-the-art baselines, achieving an optimal trade-off between violation rectification and intent preservation.
Submission Type: Deployed
Copyright Form: pdf
Submission Number: 87