Abstract: Dynamic convolution has recently been shown to improve the performance of CNN-based networks in medical image segmentation. The core idea is to replace a static convolutional kernel with a linear combination of multiple convolutional kernels, weighted by an input-dependent attention function. However, existing dynamic convolution designs suffer from two limitations: i) The convolutional kernels are weighted by applying a single-dimensional attention function to the input maps, overlooking the synergy among multi-dimensional information. This results in sub-optimal computation of the convolutional kernels. ii) The linear kernel aggregation is inefficient, restricting the model’s capacity to learn more intricate patterns. In this paper, we rethink the dynamic convolution design to address these limitations and propose multi-dimensional aggregation dynamic convolution (MAGIC). Specifically, MAGIC introduces a dimensional-reciprocal fusion module that captures correlations among input maps across the spatial, channel, and global dimensions simultaneously when computing convolutional kernels. Furthermore, we design a kernel recalculation module, which enhances the efficiency of aggregation by learning the interactions between kernels. As a drop-in replacement for regular convolution, MAGIC can be flexibly integrated into prevalent pure CNN or hybrid CNN-Transformer backbones. Extensive experiments on four benchmarks demonstrate that MAGIC outperforms both regular convolution and existing dynamic convolution. Code is available at: https://github.com/Segment82/MAGIC
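For readers unfamiliar with the baseline, the following is a minimal NumPy sketch of conventional dynamic convolution as described in the abstract: a single kernel is formed as an attention-weighted linear combination of K candidate kernels and then applied to the input. It is not the MAGIC operator. The function names, the attention form (global average pooling followed by one linear map and a softmax), and all shapes are assumptions chosen for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def aggregate_kernel(x, kernels, w_att):
    """Build one input-dependent kernel from K candidates (illustrative only).

    x:       (C_in, H, W) input feature map
    kernels: (K, C_out, C_in, k, k) candidate kernels
    w_att:   (K, C_in) hypothetical attention weights
    """
    pooled = x.mean(axis=(1, 2))                   # global average pool -> (C_in,)
    attn = softmax(w_att @ pooled)                 # one weight per kernel -> (K,)
    kernel = np.tensordot(attn, kernels, axes=1)   # weighted sum -> (C_out, C_in, k, k)
    return kernel, attn

def conv2d_valid(x, kernel):
    # Plain stride-1 convolution without padding (cross-correlation form).
    c_out, c_in, k, _ = kernel.shape
    _, h, w = x.shape
    out = np.zeros((c_out, h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patch = x[:, i:i + k, j:j + k]
            out[:, i, j] = np.tensordot(kernel, patch, axes=3)
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8, 8))
kernels = rng.normal(size=(4, 5, 3, 3, 3))
w_att = rng.normal(size=(4, 3))
kernel, attn = aggregate_kernel(x, kernels, w_att)
y = conv2d_valid(x, kernel)
```

Because convolution is linear in the kernel, convolving with the aggregated kernel gives the same result as mixing the K separate convolution outputs with the same attention weights. This sketch computes the attention weights from pooled channel statistics alone, which illustrates the single-dimensional attention the abstract identifies as the first limitation.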
Primary Subject Area: [Content] Vision and Language
Relevance To Conference: Automatically segmenting data from various imaging modalities (such as CT and MRI scans) is one of the most fundamental yet challenging tasks in medical image analysis. Our work introduces a novel convolutional operator that enhances the accuracy and efficiency of medical image segmentation algorithms. This advancement contributes to the ACM Multimedia conference by bridging the gap between multimedia technologies and healthcare applications, and it showcases the potential of multimedia technologies to facilitate complex medical image analysis. The contribution underscores the importance of ACM Multimedia as a platform for fostering innovations with real-world impact, particularly in improving the precision of medical diagnoses.
Supplementary Material: zip
Submission Number: 1784