Abstract: Photographic mosaics (or simply photomosaics) are images composed of smaller, equally sized image tiles such that, when viewed from a distance, the tiles collectively resemble a perceptually plausible image. In this paper, we consider the challenge of automatically generating a photomosaic from an input image. Although computer-generated photomosaicking has existed for quite some time, no prior approach has simultaneously exploited colour/grayscale intensity, the structure of the input across scales, and image semantics. We propose a convolutional network for generating photomosaics, guided by a multi-scale perceptual loss that captures colour, structure, and semantics. We demonstrate the effectiveness of our multi-scale perceptual loss by producing extremely high-resolution photomosaics and through ablation experiments that compare it with a single-scale variant of the perceptual loss. We show that, overall, our approach produces visually pleasing results, providing a substantial improvement over common baselines.