GARE-Net: Geometric contextual aggregation and regional contextual enhancement network for image-text matching
Abstract: Highlights
• The proposed GARE-Net for image-text matching consists of two key modules, GCFA and RCFE.
• Geometric Contextual Feature Aggregation (GCFA) tells where a given region is within the image.
• Regional Contextual Feature Enhancement (RCFE) reflects what the surrounding regions are and where they are.
External IDs: dblp:journals/eswa/ZhongZCZ26