Keywords: transformer, brain tumour segmentation
TL;DR: VTBIS: Vision Transformer for Biomedical Image Segmentation
Abstract: In this paper, we propose a novel network named Vision
Transformer for Biomedical Image Segmentation (VTBIS). In both the
encoder and the decoder, our network splits the input feature maps
into three parts using 1 × 1, 3 × 3 and 5 × 5 convolutions. A
concatenation operator merges the resulting features, which are then
fed to three consecutive transformer blocks with embedded attention
mechanisms. Skip connections link the encoder and decoder transformer
blocks. Similarly, the decoder uses transformer blocks and the
multi-scale architecture, and its output is linearly projected to
produce the segmentation map. We test the performance of our network
on the Synapse multi-organ segmentation dataset, the Automated
Cardiac Diagnosis Challenge dataset, the Brain tumour MRI
segmentation dataset and the Spleen CT segmentation dataset. Without
bells and whistles, our network outperforms most previous
state-of-the-art CNN- and transformer-based models, as measured by
the Dice score and the Hausdorff distance.
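
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the nested-list mask format and the toy masks are assumptions made for illustration, and the Hausdorff distance shown is the plain maximum variant (segmentation papers often report a percentile variant instead, which the abstract does not specify).

```python
import math


def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def foreground_points(mask):
    """Return (row, col) coordinates of all non-zero pixels."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def hausdorff_distance(pred, target):
    """Symmetric Hausdorff distance between the foreground pixel sets (brute force)."""
    a = foreground_points(pred)
    b = foreground_points(target)
    if not a or not b:
        raise ValueError("Hausdorff distance is undefined for an empty mask")

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Hypothetical 3 x 3 masks, chosen only to exercise the functions.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]

print(dice_score(pred, target))
print(hausdorff_distance(pred, target))
```

A higher Dice score indicates better overlap, while a lower Hausdorff distance indicates that the predicted and reference regions lie closer together, so the two metrics are read in opposite directions.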