FILIP: Fine-grained Interactive Language-Image Pre-Training

2022 (modified: 03 Nov 2022), ICLR 2022, Readers: Everyone
Abstract: Unsupervised large-scale vision-language pre-training has shown promising advances on various downstream tasks. Existing methods often model the cross-modal interaction either via the similarity of...
