SSR-A: Spatial- and Semantic-Aware Instructions and Curriculum Reinforcement for Advertisement Compliant Rectification
Keywords: Image Ad compliance Rectification, Image Edit, Reinforcement Learning, Multimodal Large Language Models
Abstract: While advertising is a cornerstone of commercial growth, it is constrained by online violation detection systems that reject non-compliant content on the order of millions of items per day.
Advertisers urgently require automated solutions to rectify these advertisements, especially visual ads, because manual correction does not scale.
Although recent safety-driven methods can achieve compliance, they typically over-edit, destroying the original commercial intent and reducing perceptual similarity to the source image.
To address this, we present SSR-A, a framework tailored for the minimalist rectification of non-compliant image ads.
Instead of fine-tuning image editing models directly, SSR-A focuses on translating violation policies into targeted editing instructions.
We first introduce a Spatial- and Semantic-Aware Instruction Synthesis Pipeline, in which MLLMs synthesize candidate instructions that incorporate spatial grounding and semantic guidance, and then select the optimal instruction via multi-dimensional evaluation.
Furthermore, we align the model using Curriculum Reinforcement Learning, employing GRPO with multi-faceted rewards to progressively navigate the trade-off between compliance and visual preservation.
Extensive experiments and online A/B tests show that SSR-A significantly outperforms state-of-the-art baselines in both compliance and preservation of visual and commercial consistency.
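As a rough illustration of the reinforcement step described above, the sketch below shows how a GRPO-style update could turn per-candidate rewards into group-relative advantages, with a curriculum that shifts weight between a compliance score and a preservation score. It is not the paper's implementation: the names `curriculum_weights` and `combined_reward`, the two-term reward, the linear schedule, and its direction (compliance-heavy first, preservation later) are all assumptions made for illustration. Only the normalization of rewards within a sampled group reflects the standard GRPO formulation.

```python
from statistics import mean, pstdev


def curriculum_weights(progress: float) -> tuple[float, float]:
    """Hypothetical linear schedule (an assumption, not from the paper).

    progress is the fraction of training completed, clamped to [0, 1].
    Returns (compliance_weight, preservation_weight), which sum to 1.
    """
    p = min(max(progress, 0.0), 1.0)
    w_preserve = 0.2 + 0.3 * p  # preservation weight grows from 0.2 to 0.5
    return 1.0 - w_preserve, w_preserve


def combined_reward(compliance: float, preservation: float, progress: float) -> float:
    """Weighted sum of two illustrative reward terms, each assumed to lie in [0, 1]."""
    w_c, w_p = curriculum_weights(progress)
    return w_c * compliance + w_p * preservation


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO-style advantages: standardize rewards within one sampled group.

    Each reward belongs to one candidate instruction sampled for the same ad.
    Population standard deviation is used here; implementations differ on this.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Three hypothetical candidates for one ad: (compliance, preservation) scores.
    scores = [(1.0, 0.4), (1.0, 0.9), (0.0, 1.0)]
    rewards = [combined_reward(c, v, progress=0.5) for c, v in scores]
    print(group_relative_advantages(rewards))
```

Because advantages are computed relative to the group mean, a candidate is rewarded only for being better than its siblings for the same ad, which is why no separate value network is needed in this style of update.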
Submission Type: Deployed
Copyright Form: pdf
Submission Number: 86