- Keywords: Blind image inpainting, visual consistency, spatial normalization, generative adversarial networks
- Abstract: Reproducibility Summary: In this study, we reproduce, to a large extent, the results of the paper introducing VCNet, a novel blind image inpainting architecture that jointly controls its mask prediction and blind inpainting modules. We implemented the architecture from scratch in PyTorch and then conducted our experiments and blind inpainting evaluation on all datasets, as described in the paper. We were able to reproduce the results qualitatively and quantitatively in most cases. Scope of Reproducibility: We validate the qualitative and quantitative results of VCNet on robust blind image inpainting. Methodology: The original study achieves robust blind image inpainting by using the mask prediction network (MPN) to guide the blind inpainting branch (RIN). We implemented the paper from scratch in PyTorch. We conducted the synthetic data experiments on the FFHQ and Places2 datasets, and also attempted some of the blind inpainting evaluation tasks stated in the paper. Each experiment took 4 to 6 days on a single RTX 2080 Ti and required no other significant resources apart from GPU memory. Results: We were able to reproduce the results qualitatively and quantitatively to a large extent. Qualitatively, MPN learns to locate the corrupted areas of input images in the early training steps, as reported in the paper, and RIN produces visually plausible outputs after extensive hyper-parameter tuning. We also measured the quantitative performance of the reproduced model with the metrics reported in the paper. What was easy: The paper is well written. Apart from Probabilistic Context Normalization, the main components of VCNet are built from standard PyTorch layers, so the general implementation is straightforward. We had no issues implementing the loss functions, and all hyper-parameters are precisely specified in the paper.
What was difficult: Due to limited computational resources, we could not train with the batch size reported in the paper, and therefore had to adjust the learning rates and the number of training steps to the batch size that fit our setup. GAN training is quite unstable, which made hyper-parameter tuning difficult after any such change. We resolved issues with the mask smoothing procedure and the input format for RIN after communicating with the authors. For the face-swapping experiments, we could not reproduce the reported results by applying exactly the technique described in the paper, even after fine-tuning trials. The failure cases can be found in this report. Communication with original authors: We have been in contact with the authors since the beginning of the challenge. They answered our questions swiftly and clarified some important points that were missing from the description of the training scheme and the architecture.
- Paper Url: https://openreview.net/forum?id=B8II-BEdx9&noteId=UX1Qpc0qvZQ&referrer=%5BML%20Reproducibility%20Challenge%202020%5D(%2Fgroup%3Fid%3DML_Reproducibility_Challenge%2F2020)
- Supplementary Material: zip