Abstract: Highlights
• We propose a pure transformer, TFormer, for multi-modal skin lesion diagnosis.
• We design a dual-branch HMT block for image modalities fusion.
• A “divide and conquer” strategy is adopted to tackle fusions between modalities.
• The proposed TFormer achieves state-of-the-art performance on a benchmark dataset.