Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts

2022 (modified: 24 Apr 2023), ICML 2022, Readers: Everyone
Abstract: Most existing methods in vision language pre-training rely on object-centric features extracted through object detection and make fine-grained alignments between the extracted features and texts. I...