MODSC: Many-Objective-Optimization-Driven Data-Balancing Strategy in Cross-Architectural Malware Classification for Extreme IoT
Abstract: Stable operation of IoT systems in extreme environments is critical for infrastructure such as energy, aerospace, and healthcare. Because of complex operating conditions and infrequent device maintenance, resource-constrained devices are vulnerable to malware that exploits known weaknesses, which puts systems at serious risk of failure or information leakage. Automated security defense based on machine learning, however, faces two challenges: low cross-architectural generalization and multidimensional data imbalance. To address these issues, we propose MODSC, a many-objective-optimization-driven data-balancing strategy for cross-architectural malware classification. MODSC builds an optimization model for data-balancing strategy search from data set information, focusing on rebalancing the data space along several dimensions, including category distribution and architectural distribution. A many-objective evolutionary algorithm guided by a performance balance function (MaOEA-PBF) is then designed to solve this model. The MODSC framework was tested on a group of IoT malware data sets with stepped imbalance rates covering six common architectures. The experimental results show that, compared with popular data processing methods, MODSC steadily and effectively improves cross-architectural generalization while maintaining a high level of confidence.
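To make the idea of imbalance along two dimensions concrete, the following is a minimal sketch and not the authors' implementation. It assumes each sample carries a category label and an architecture label, measures imbalance as the ratio of the largest to the smallest group, and treats one candidate balancing strategy as a table of target counts per (category, architecture) cell. The toy data, the helper names `imbalance_ratio` and `rebalance`, and the use of random resampling are all illustrative assumptions. The strategy search itself (MaOEA-PBF) is not shown.

```python
import random
from collections import Counter

def imbalance_ratio(labels):
    """Largest group size divided by smallest group size (1.0 means balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

def rebalance(samples, targets, seed=0):
    """Resample each (category, architecture) cell to its target count.

    Hypothetical helper: cells above target are randomly undersampled,
    cells below target are randomly oversampled with replacement.
    """
    rng = random.Random(seed)
    balanced = []
    for cell, target in targets.items():
        pool = [s for s in samples if (s[1], s[2]) == cell]
        if target <= len(pool):
            balanced += rng.sample(pool, target)
        else:
            balanced += pool + rng.choices(pool, k=target - len(pool))
    return balanced

# Toy data set (assumed for illustration): (sample_id, category, architecture).
cell_sizes = {
    ("benign", "ARM"): 6,
    ("malware", "ARM"): 2,
    ("benign", "MIPS"): 3,
    ("malware", "MIPS"): 1,
}
samples = [
    (f"{cat}-{arch}-{i}", cat, arch)
    for (cat, arch), n in cell_sizes.items()
    for i in range(n)
]

# Imbalance along each dimension before rebalancing.
category_ir = imbalance_ratio(s[1] for s in samples)
architecture_ir = imbalance_ratio(s[2] for s in samples)

# One candidate strategy: bring every cell to 4 samples.
strategy = {cell: 4 for cell in cell_sizes}
balanced = rebalance(samples, strategy)

balanced_category_ir = imbalance_ratio(s[1] for s in balanced)
balanced_architecture_ir = imbalance_ratio(s[2] for s in balanced)
```

In a search setting, each candidate `strategy` would be scored by training a classifier on the rebalanced data and evaluating it per architecture. Those per-architecture scores would then serve as the multiple objectives.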