Abstract: High-resolution images play an essential role in the performance of image analysis and pattern recognition methods. However, the expensive setup required to generate them and the inherent limitations of the sensors in optics manufacturing technology restrict the availability of these images. In this work, we exploit the information contained in feature maps extracted by VGG networks and apply a transformer network to make the model invariant to spatial affine transformations such as translation, scaling, and rotation. To evaluate and compare the performance of the model, we used three publicly available datasets. The model compared favorably with the baseline method in terms of the PSNR and SSIM image quality metrics.
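For readers unfamiliar with the first of the two metrics, the following is a minimal sketch of the standard PSNR definition, 10 log10(MAX^2 / MSE), for grayscale images stored as nested lists. The function name `psnr`, the `max_value` parameter, and the 8-bit default of 255 are illustrative assumptions and are not taken from the paper's evaluation code.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized grayscale images.

    Assumption: images are nested lists of pixel intensities in [0, max_value].
    """
    # Flatten both images into one list of pixels each.
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if not ref or len(ref) != len(est):
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR indicates that the estimate is closer to the reference. SSIM additionally compares local luminance, contrast, and structure, and is usually computed with an existing library rather than by hand.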