Abstract: We present a new model trained on two Positron Emission Tomography (PET) modalities, PET-AV45 and PET-FDG, for Alzheimer’s Disease (AD) diagnosis. Unlike conventional methods that use multi-modal 3D/2D CNN architectures, our design replaces the Convolutional Neural Network (CNN) with a Vision Transformer (ViT). Because 3D images are computationally expensive to process, we first apply a 3D-to-2D operation that projects the 3D PET images into 2D fusion images. We then forward the fused multi-modal 2D images to a parallel ViT model for feature extraction, followed by classification for AD diagnosis. For evaluation, we use PET images from ADNI. In our experiments, the proposed model outperforms several strong baseline models, achieving 0.91 accuracy and 0.95 AUC.
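The data flow described in the abstract (3D volume, 2D projection, multi-modal fusion, ViT-style patch tokens) can be illustrated with a minimal NumPy sketch. The abstract does not specify the projection or fusion operations, so everything below is an assumption for illustration only: the mean-intensity projection, the channel-stacking fusion, the patch size of 16, and the function names `project_3d_to_2d`, `fuse_modalities`, and `patchify` are hypothetical and are not the authors' method. The ViT encoder and classifier are omitted.

```python
import numpy as np


def project_3d_to_2d(volume, axis=2):
    """Collapse a 3D volume to a 2D image.

    ASSUMPTION: a mean-intensity projection along one axis, followed by
    min-max scaling to [0, 1]. The abstract does not say which 3D-to-2D
    operation is used.
    """
    if volume.ndim != 3:
        raise ValueError("expected a 3D volume")
    image = volume.mean(axis=axis)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # Constant image: avoid division by zero.
        return np.zeros_like(image, dtype=float)
    return (image - lo) / (hi - lo)


def fuse_modalities(av45_2d, fdg_2d):
    """Fuse two projected modalities into one multi-channel 2D image.

    ASSUMPTION: fusion by stacking the modalities as channels, giving an
    array of shape (2, H, W).
    """
    if av45_2d.shape != fdg_2d.shape:
        raise ValueError("modalities must share the same 2D shape")
    return np.stack([av45_2d, fdg_2d], axis=0)


def patchify(image, patch=16):
    """Split a (C, H, W) image into flattened non-overlapping patches.

    This is the standard tokenisation step in front of a ViT encoder.
    Returns an array of shape (num_patches, C * patch * patch), with
    patches ordered row by row.
    """
    c, h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("H and W must be divisible by the patch size")
    x = image.reshape(c, h // patch, patch, w // patch, patch)
    x = x.transpose(1, 3, 0, 2, 4)  # (H/p, W/p, C, p, p)
    return x.reshape((h // patch) * (w // patch), c * patch * patch)


if __name__ == "__main__":
    # Random arrays stand in for registered PET volumes (placeholder data).
    rng = np.random.default_rng(0)
    av45 = rng.random((64, 64, 48))
    fdg = rng.random((64, 64, 48))

    fused = fuse_modalities(project_3d_to_2d(av45), project_3d_to_2d(fdg))
    tokens = patchify(fused, patch=16)
    print(fused.shape, tokens.shape)
```

In this sketch each row of `tokens` would be linearly embedded and passed to a transformer encoder. How the "parallel" ViT branches consume the fused images is not stated in the abstract and is left open here.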