Evolution-aware VAriance (EVA) Coreset Selection for Medical Image Classification

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, MM2024 Oral, CC BY 4.0
Abstract: In the medical field, managing massive, high-dimensional imaging data and performing reliable analysis on it is a critical challenge, especially in resource-limited environments such as remote medical facilities and mobile devices. This necessitates effective dataset compression techniques that reduce storage, transmission, and computational costs. However, existing coreset selection methods are primarily designed for natural image datasets, and their effectiveness on medical image datasets is doubtful because of challenges such as intra-class variation and inter-class similarity. In this paper, we propose a novel coreset selection strategy termed Evolution-aware VAriance (EVA), which captures the evolutionary process of model training through a dual-window approach and reflects the fluctuation of sample importance more precisely through variance measurement. Extensive experiments on medical image datasets demonstrate the effectiveness of our strategy over previous state-of-the-art (SOTA) methods, especially at high compression rates. EVA achieves 98.27% accuracy with only 10% of the training data, compared to 97.20% for the full training set. None of the compared baseline methods exceeds Random at a 5% selection rate, while EVA outperforms Random by 5.61%, showcasing its potential for efficient medical image analysis.
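The abstract names two ingredients, a dual-window view of training and a variance measurement, but does not give the scoring formula. The following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that a per-sample training signal (for example, the loss) is recorded at every epoch, that the two windows are epoch ranges, that the score is the sum of the population variances within each window, and that the highest-scoring samples are kept. The names `eva_like_scores`, `select_coreset`, `early_window`, and `late_window` are hypothetical.

```python
from statistics import pvariance


def eva_like_scores(signal, early_window, late_window):
    """Score each sample by how much its training signal fluctuates.

    signal[e][i] is the recorded value (assumed here: the loss) of sample i
    at epoch e. Each window is a (start, stop) pair of epoch indices, with
    stop exclusive. The sum of the two window variances is an assumption of
    this sketch, not a formula taken from the paper.
    """
    num_samples = len(signal[0])
    scores = []
    for i in range(num_samples):
        early = [signal[e][i] for e in range(*early_window)]
        late = [signal[e][i] for e in range(*late_window)]
        scores.append(pvariance(early) + pvariance(late))
    return scores


def select_coreset(scores, rate):
    """Return the sorted indices of the top `rate` fraction of samples."""
    k = max(1, round(rate * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Toy data: 4 epochs (rows) by 4 samples (columns), made up for illustration.
    signal = [
        [1.0, 0.9, 1.0, 0.2],
        [0.2, 0.9, 0.6, 0.2],
        [0.9, 0.8, 0.4, 0.2],
        [0.1, 0.8, 0.2, 0.2],
    ]
    scores = eva_like_scores(signal, early_window=(0, 2), late_window=(2, 4))
    print(select_coreset(scores, rate=0.5))
```

In this toy run, samples whose signal stays flat inside both windows score zero and are dropped first, while samples that fluctuate are retained. Whether EVA combines, weights, or normalizes the windows differently cannot be determined from the abstract alone.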
Primary Subject Area: [Engagement] Summarization, Analytics, and Storytelling
Secondary Subject Area: [Experience] Interactions and Quality of Experience
Relevance To Conference: Our work contributes to multimedia/multimodal processing by introducing Evolution-aware VAriance (EVA), a novel coreset selection strategy that improves the efficiency and effectiveness of handling large-scale image datasets and can be extended to multimodal dataset compression. By precisely identifying key data instances through variance measurement, EVA ensures that the compressed dataset retains essential information, facilitating faster and more accurate processing. Our extensive experiments demonstrate EVA's effectiveness, particularly at high compression rates. This advancement is crucial for engaging users with multimedia content and enhancing the user experience, especially in applications requiring real-time monitoring and analysis, such as those in remote medical facilities and on mobile devices.
Supplementary Material: zip
Submission Number: 4896