Abstract: Histopathology analysis is the gold standard for medical diagnosis. Accurate classification of whole slide images (WSIs) and region-of-interest (ROI) level localization will assist pathologists in clinical diagnosis. With gigapixel resolution and a scarcity of fine-grained annotations, WSIs are difficult to classify directly. In the field of weakly supervised learning, multiple instance learning (MIL) is a promising approach to WSI classification tasks. Currently, a prevailing aggregation strategy is to apply an attention mechanism as a measure of the importance of each instance for further classification. However, the attention mechanism fails to capture inter-instance information, and the self-attention mechanism incurs quadratic computational complexity. To address these challenges, we propose an agent aggregator with a mask denoise mechanism for multiple instance learning, termed AMD-MIL. The agent token serves as an intermediate variable between the query and key for implicit computation of instance importance. The mask and denoising matrices are likewise learnable and are mapped from the agent-aggregated value; they first dynamically mask out some low-contribution instance representations and then eliminate the relative noise introduced by the masking process. AMD-MIL can indirectly achieve more reasonable attention allocation by adjusting feature representations, thereby sensitively capturing micro-metastases in cancer and achieving better interpretability. Our extensive experiments on the CAMELYON-16, CAMELYON-17, TCGA-KIDNEY, and TCGA-LUNG datasets show our method’s superiority over existing state-of-the-art approaches. The code will be available upon acceptance.
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Experience] Multimedia Applications, [Content] Media Interpretation
Relevance To Conference: We introduce the AMD-MIL method, aimed at enhancing the classification and analysis of whole slide images (WSIs), especially in clinical diagnosis and cancer detection applications. This work makes two key contributions. (1) A novel agent aggregator is proposed. By introducing an agent token as an intermediary variable between the query and key, AMD-MIL can perform global modeling with approximately linear complexity and assign an importance score to each instance. This approach not only enhances computational efficiency but also maintains the precision of the analysis. (2) Dynamic masking and denoising are introduced. Through learnable mask and denoising matrices, AMD-MIL dynamically masks some low-contribution instance representations and then eliminates the relative noise introduced by the masking process. This method can indirectly achieve more reasonable attention distributions by adjusting feature representations, thereby sensitively capturing micro-metastases in cancer and achieving better interpretability. To conclude, our method significantly contributes to the construction of a standardized process in histopathology analysis and can also be extended to multiple staining modalities.
Supplementary Material: zip
Submission Number: 3920
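Illustrative sketch: The agent aggregation described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `agent_aggregate`, the projection matrices `w_q`, `w_k`, `w_v`, the single-head layout, and the use of the mean agent-to-instance attention as an importance score are all assumptions made for the example. The mask and denoising matrices are omitted because their exact form is not specified here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def agent_aggregate(feats, w_q, w_k, w_v, agents):
    """Two-step attention routed through a small set of agent tokens.

    feats:  (N, D) instance (patch) features of one slide
    w_q, w_k, w_v: (D, d) projection matrices (hypothetical names)
    agents: (n, d) agent tokens, with n much smaller than N
    """
    d = agents.shape[1]
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v  # each (N, d)

    # Step 1: each agent attends over all instances and pools their values.
    agent_to_inst = softmax(agents @ k.T / np.sqrt(d), axis=-1)  # (n, N)
    agent_v = agent_to_inst @ v                                  # (n, d)

    # Step 2: each instance reads the pooled values back from the agents.
    inst_to_agent = softmax(q @ agents.T / np.sqrt(d), axis=-1)  # (N, n)
    out = inst_to_agent @ agent_v                                # (N, d)

    # Assumed importance readout: average attention each instance
    # receives from the agents (non-negative, sums to 1).
    importance = agent_to_inst.mean(axis=0)                      # (N,)
    return out, importance
```

Because neither step forms an N × N attention matrix, the cost grows as O(N · n · d) for n agents, which is the sense in which the complexity is approximately linear in the number of instances N when n is fixed and small.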