InnoAds: A Foundation Dataset and Benchmark for Advertising Visual Effects in Video Generation

11 Sept 2025 (modified: 12 Nov 2025), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: Advertising Visual Effects, Video Generation, Dataset and Benchmark
TL;DR: We propose InnoAds-32K, a 32K image–text–video dataset, and InnoAds-Bench, a six-metric benchmark for advertising visual effects. Evaluations show SOTA video generation models fall short on Ad-VFX, setting a new research foundation.
Abstract: Advertising visual effects (Ad-VFX) are the significant visual elements of advertising videos that combine dynamic product presentation with accompanying descriptive text. However, research on Ad-VFX has been hindered by the lack of dedicated datasets and standardized evaluation protocols, as it is still an emerging domain. To address this issue, we introduce InnoAds-32K, a foundation dataset of over 32,000 curated image–text–video triples tailored for advertising scenarios. Furthermore, we propose InnoAds-Bench, a comprehensive benchmark that spans six evaluation dimensions: visual quality and text relevance as general metrics, and motion, product consistency, text stability, and creative rationality as advertising-specific metrics. Based on this suite, we systematically evaluate state-of-the-art video generation models, revealing substantial limitations in their ability to satisfy advertising requirements. In summary, InnoAds-32K and InnoAds-Bench provide the first standardized foundation for Ad-VFX video generation, paving the way for future research in advertising scenarios.
Primary Area: datasets and benchmarks
Submission Number: 3876