Hi-End-MAE: Hierarchical encoder-driven masked autoencoders are stronger vision learners for medical image segmentation
Abstract: Highlights
• Introduces encoder-driven reconstruction for high-quality representation learning.
• Hierarchical dense decoding captures multi-layer anatomical representations.
• Reveals the substantial potential of Vision Transformers in medical imaging applications.
• Achieves superior performance across seven medical segmentation benchmarks.
External IDs: doi:10.1016/j.media.2025.103770