Exploring the synergies of hybrid convolutional neural network and Vision Transformer architectures for computer vision: A survey

Published: 01 Jan 2025, Last Modified: 03 Mar 2025. Eng. Appl. Artif. Intell. 2025. License: CC BY-SA 4.0
Abstract: Highlights
• Overview of CNN and ViT evolution, highlighting strengths in CV tasks.
• Taxonomy of hybrid CNN-ViT models: parallel, serial, early and late fusion, MHSA.
• Comparative analysis and application of hybrid designs for CV tasks.
• Challenges: design complexity, computational cost, and future research.
• Research conclusion for researchers and developers.