Keywords: Advertising Visual Effects, Video Generation, Dataset and Benchmark
TL;DR: We propose InnoAds-32K, a 32K image–text–video dataset, and InnoAds-Bench, a six-metric benchmark for advertising visual effects. Evaluations show that SOTA video generation models fall short on Ad-VFX; together, the dataset and benchmark provide a foundation for future research.
Abstract: Advertising visual effects (Ad-VFX) are key visual elements of advertising videos, combining dynamic product presentation with accompanying descriptive text.
However, because Ad-VFX is still an emerging domain, research on it has been hindered by the lack of dedicated datasets and standardized evaluation protocols.
To address this issue, we introduce InnoAds-32K, a foundation dataset of over 32,000 curated image–text–video triples tailored for advertising scenarios.
Furthermore, we propose InnoAds-Bench, a comprehensive benchmark that spans six evaluation dimensions: visual quality and text relevance as general metrics, and motion, product consistency, text stability, and creative rationality as advertising-specific metrics.
Based on this suite, we systematically evaluate state-of-the-art video generation models, revealing substantial limitations in their ability to satisfy advertising requirements.
In summary, InnoAds-32K and InnoAds-Bench provide the first standardized foundation for Ad-VFX video generation, paving the way for future research in advertising scenarios.
Primary Area: datasets and benchmarks
Submission Number: 3876