Abstract: The evolution of IoT malware has spurred interest in building malware family classification models. However, these models face security concerns stemming from their limited interpretability and from vulnerabilities in the training pipeline. Recent research has highlighted the limitations of learning-based malware classifiers, which are susceptible to backdoor attacks because they rely on human-engineered features that simplify the mapping from features to binary perturbations. In contrast, our study follows the current trajectory of the malware classification field by emphasizing the detection of backdoor attacks targeting models that use features extracted from within the model itself. To thoroughly assess model vulnerabilities, we devise a dynamic trigger generation method based on sample features, which we call "BENIGN". We use this approach to poison the model and launch attacks against it, together with a tailored training process designed to achieve specific attack objectives. Through experiments, we analyze how the variables involved in the training procedure affect attack stability and success rates. Finally, we evaluate mitigation methods and highlight the challenges and adaptability required to defend against these attack strategies.