Abstract: Deep learning and its applications in Computer Vision (CV) have been studied extensively. However, projecting an object instance onto a real image or video in a semantically coherent manner, such that the projected object is indistinguishable from a real one, is still in its infancy. In this work, we evaluate a generative model that generates an object instance and places it onto an image in a semantically coherent manner using a where module and a what module, both of which employ Generative Adversarial Networks (GANs). Furthermore, we improve shape generation by adding a classifier that is applied to the training data before it is used. Finally, we aim to increase training stability by using an alternative training methodology and by replacing the Jensen-Shannon divergence with the Wasserstein distance. As a result, this work improves the stability of an existing generative model that inserts object instances into images, and it also improves the model's performance.
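To illustrate the change of objective mentioned above, the following is a minimal sketch that contrasts the standard GAN discriminator loss (whose optimum is tied to the Jensen-Shannon divergence) with a Wasserstein-style critic loss. The function names are hypothetical and chosen only for illustration. The abstract does not state which Wasserstein variant is used, so the Lipschitz constraint on the critic (for example weight clipping or a gradient penalty) is deliberately left out, and this should not be read as the authors' implementation.

```python
import math


def gan_discriminator_loss(real_probs, fake_probs):
    """Standard GAN discriminator loss, linked to the Jensen-Shannon divergence.

    Inputs are discriminator outputs interpreted as probabilities in (0, 1).
    """
    real_term = sum(math.log(p) for p in real_probs) / len(real_probs)
    fake_term = sum(math.log(1.0 - p) for p in fake_probs) / len(fake_probs)
    return -(real_term + fake_term)


def wasserstein_critic_loss(real_scores, fake_scores):
    """Wasserstein-style critic loss on unbounded real-valued scores.

    Minimising this value maximises mean(real_scores) - mean(fake_scores).
    The Lipschitz constraint on the critic is assumed to be enforced elsewhere.
    """
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    return mean_fake - mean_real


def wasserstein_generator_loss(fake_scores):
    """Generator loss: push the critic's scores on generated samples upwards."""
    return -sum(fake_scores) / len(fake_scores)
```

The practical difference is that the critic outputs raw scores rather than probabilities, so no logarithm or sigmoid appears in the loss. This is commonly cited as one reason the Wasserstein objective gives more usable gradients when real and generated distributions barely overlap.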