<div align="center">

This code is essentially a simple "copy-paste" of OpenAI CLIP [ [code](https://github.com/openai/CLIP/tree/main/clip) \| [paper](https://arxiv.org/abs/2103.00020) \| [blog](https://openai.com/blog/clip/) ].

Here, the CLIP vision features are used as the masked image modeling (MIM) pre-training target of EVA.

</div>
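
To make the idea of a "feature target" concrete, here is a toy sketch in plain Python. It assumes the targets are per-patch feature vectors produced by a frozen CLIP vision encoder, and that the model is trained with a negative cosine similarity loss computed only on the masked patches. The function names (`cosine_similarity`, `masked_feature_loss`) and the toy data are hypothetical and not taken from this repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def masked_feature_loss(predicted, target, mask):
    """Mean negative cosine similarity over masked patches only.

    predicted, target: one feature vector per image patch.
    mask: True where the patch was hidden from the model being trained.
    Assumption: this loss form is illustrative, not confirmed by this README.
    """
    terms = [
        -cosine_similarity(p, t)
        for p, t, m in zip(predicted, target, mask)
        if m
    ]
    if not terms:  # nothing was masked, so there is nothing to regress
        return 0.0
    return sum(terms) / len(terms)


# Toy data: 3 patches with 2-dim features (real CLIP features are much larger).
target = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # stand-in for CLIP vision features
predicted = [[2.0, 0.0], [1.0, 0.0], [0.0, 3.0]]   # stand-in for model outputs
mask = [True, True, False]                          # the third patch was visible

loss = masked_feature_loss(predicted, target, mask)
print(loss)  # -0.5
```

The first masked patch matches its target direction exactly (similarity 1) and the second is orthogonal to it (similarity 0), so the mean loss is -0.5. The visible third patch does not contribute.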