ReViT: A Hybrid Approach for BCLC Staging of Hepatocellular Carcinoma Using 3D CT with Multiple Instance Learning

Published: 25 Sept 2024, Last Modified: 24 Oct 2024, Venue: IEEE BHI'24, Readers: Everyone, License: CC BY 4.0
Keywords: Medical Imaging, Hepatocellular Carcinoma, BCLC staging, Convolutional Neural Network, Vision Transformer, Multiple Instance Learning
TL;DR: ReViT: A hybrid CNN-ViT model leveraging MIL for enhanced BCLC staging of HCC using 3D CT images, featuring innovative Masked Cropping and Padding preprocessing.
Abstract: Deep learning has revolutionized medical imaging, offering advanced methods for accurate diagnosis and treatment planning. The Barcelona Clinic Liver Cancer (BCLC) system is crucial for staging Hepatocellular Carcinoma (HCC), a high-mortality cancer, and an automated BCLC staging system could significantly improve the efficiency of diagnosis and treatment planning. We found that BCLC staging, which is directly related to the size and number of liver tumors, aligns well with the principles of the Multiple Instance Learning (MIL) framework. To apply MIL effectively, we propose a new preprocessing technique called Masked Cropping and Padding (MCP), which addresses the variability in liver volumes and ensures consistent input sizes. This technique preserves the structural integrity of the liver, facilitating more effective learning. Furthermore, we introduce ReViT, a novel hybrid model that integrates the local feature extraction capabilities of Convolutional Neural Networks (CNNs) with the global context modeling of Vision Transformers (ViTs). ReViT leverages the strengths of both architectures within the MIL framework, enabling a robust and accurate approach to BCLC staging. We further explore the trade-off between performance and interpretability by employing TopK Pooling strategies, which focus the model on the most informative instances within each bag.
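The abstract does not spell out how MCP or TopK Pooling are implemented, so the following is only a minimal NumPy sketch of the two ideas as described: crop a CT volume to the liver mask's bounding box, pad it to a fixed shape, and aggregate per-instance scores by averaging the k highest. The function names (`masked_crop_and_pad`, `topk_mean_pool`), the zeroing of voxels outside the mask, the centered zero padding, the center-crop fallback for oversized livers, and the mean-of-top-k aggregation are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def masked_crop_and_pad(volume, mask, target_shape):
    """Crop `volume` to the bounding box of `mask`, then pad to `target_shape`.

    Assumptions (not from the paper): voxels outside the mask are zeroed,
    padding is zero-valued and centered, and a crop larger than the target
    along some axis is center-cropped on that axis.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask is empty; nothing to crop")

    # Bounding box of the mask along every axis.
    coords = np.argwhere(mask)
    lo = coords.min(axis=0)
    hi = coords.max(axis=0) + 1
    box = tuple(slice(a, b) for a, b in zip(lo, hi))

    # Keep only masked voxels inside the bounding box.
    cropped = np.where(mask[box], volume[box], 0)

    # Place the crop at the center of a fixed-size, zero-filled output.
    out = np.zeros(target_shape, dtype=volume.dtype)
    src, dst = [], []
    for c, t in zip(cropped.shape, target_shape):
        if c >= t:
            start = (c - t) // 2
            src.append(slice(start, start + t))
            dst.append(slice(0, t))
        else:
            start = (t - c) // 2
            src.append(slice(0, c))
            dst.append(slice(start, start + c))
    out[tuple(dst)] = cropped[tuple(src)]
    return out


def topk_mean_pool(instance_scores, k):
    """Aggregate a bag by averaging its k highest instance scores.

    `instance_scores` has shape (n_instances,) or (n_instances, n_classes);
    the top k are selected independently per class.
    """
    scores = np.asarray(instance_scores, dtype=float)
    k = max(1, min(k, scores.shape[0]))
    top = np.sort(scores, axis=0)[::-1][:k]  # descending along instances
    return top.mean(axis=0)
```

In this sketch, each fixed-size output of `masked_crop_and_pad` could be split into slices or patches that serve as the instances of one bag, and `topk_mean_pool` would turn their scores into a single bag-level prediction. A smaller `k` ties the prediction to fewer instances, which is one way to read the performance versus interpretability trade-off the abstract mentions.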
Track: 4. AI-based clinical decision support systems
Registration Id: 43N8W9F4YN5
Submission Number: 150