Abstract: Highlights•A novel Omni-scale robust contrastive learning framework achieves SOTA.•A novel perception comprehension module extracts comprehensive image information.•Two novel contrastive learning approaches are proposed.•An answer generation module facilitates accurate prediction process.
Loading