Abstract: Self-supervised pretraining has shown impressive performance in recent years, matching or even outperforming ImageNet-pretrained weights on a broad range of downstream tasks. Unfortunately, existing methods require massive amounts of computing power, with large batch sizes and batch norm statistics synchronized across multiple GPUs. This effectively excludes the substantial part of the computer vision community that lacks access to extensive computing resources from the benefits of self-supervised learning.