Keywords: super-resolution, dataset construction, synthetic training
Abstract: Existing image super-resolution (SR) datasets predominantly rely on web-scraped natural images---photographs of real-world scenes---due to their abundance online.
However, this reliance hinders SR performance in specialized domains such as artwork, which comprises artificially created visuals like posters and book covers that incorporate text.
This limitation arises from the difficulty of obtaining a sufficient number of uncompressed, high-resolution artwork images from the web, as such content is scarce and often subject to copyright restrictions.
To address this issue, we propose a synthetic dataset construction pipeline that leverages advanced text-to-image (T2I) diffusion models to generate high-quality artwork images.
Using this pipeline, we construct the Generated Artwork dataset for image Super-Resolution (GASR), a dataset specifically tailored for SR on artwork images. Although GASR-DF2K contains only 16\% as many images as LSDIR, a widely used large-scale SR dataset, models trained on GASR-DF2K consistently outperform those trained on LSDIR on the artwork-centric benchmark Manga109. These results demonstrate the effectiveness of tailored synthetic data in bridging the domain gap and substantially improving SR performance on artwork.
Submission Number: 20