Exploring the synergies of hybrid convolutional neural network and Vision Transformer architectures for computer vision: A survey

Yunusa Haruna, Shiyin Qin, Abdulrahman Hamman Adama Chukkol, Abdulganiyu Abdu Yusuf, Isah Bello, Adamu Lawan

Published: 01 Mar 2025, Last Modified: 07 Nov 2025
Venue: Engineering Applications of Artificial Intelligence
License: CC BY-SA 4.0
Abstract:
Highlights
• Overview of CNN and ViT evolution, highlighting strengths in CV tasks.
• Taxonomy of hybrid CNN-ViT models: parallel, serial, early and late fusion, MHSA.
• Comparative analysis and application of hybrid designs for CV tasks.
• Challenges: design complexity, computational cost, and future research.
• Research conclusion for researchers and developers.