EVA-02: A visual representation for neon genesis

Published: 01 Jan 2024, Last Modified: 22 Sept 2024
Image Vis. Comput. 2024
CC BY-SA 4.0
Abstract: Highlights
• EVA-02, a plain Transformer-based visual representation, demonstrates superior performance in various vision tasks.
• EVA-02 reduces model size through robust optimization, advanced activation functions, and position embedding.
• EVA-02 achieves 90.0 fine-tuning top-1 accuracy on ImageNet-1K with only 304 M parameters.
• EVA-02-CLIP outperforms the best open-sourced CLIP in zero-shot ImageNet-1K classification, using less training data.