TL;DR: This work studies fundamental problems of diversity in adversarial ensemble learning, proposes the first diversity decompositions in the adversarial setting, and introduces a new robust ensemble method to enhance these diversities.
Abstract: Diversity has been one of the most crucial factors in the design of adversarial ensemble methods. This work focuses on two fundamental problems: how to define diversity for an adversarial ensemble, and how it correlates with algorithmic performance. We first show that precisely calculating the diversity of two networks in adversarial ensemble learning is NP-hard, which distinguishes it from prior diversity analyses. We then present the first diversity decomposition, under a first-order approximation, for adversarial ensemble learning. Specifically, the adversarial ensemble loss decomposes into the average of individual adversarial losses, gradient diversity, prediction diversity, and cross diversity. Hence, merely considering gradient diversity, as in previous adversarial ensemble methods, is insufficient to characterize diversity. We similarly present a diversity decomposition for classification with the cross-entropy loss. Based on this theoretical analysis, we develop a new ensemble method via orthogonal adversarial predictions that simultaneously improves gradient diversity and cross diversity. Finally, we conduct experiments to validate the effectiveness of our method.
Lay Summary: In real-world applications, machine learning models can be easily fooled by adversarial examples—specially crafted inputs that look almost identical to normal data but lead the model to make wrong predictions. This is a serious concern in high-stakes fields like healthcare, finance, and autonomous driving.
Ensemble learning, which combines multiple models to make better decisions, is a popular strategy for improving robustness against such attacks. Diversity has always been one of the most crucial factors in the design of ensemble methods. However, how to define and measure diversity in adversarial ensemble learning remains challenging and poorly understood. This work focuses on two fundamental problems: how to define diversity for an adversarial ensemble, and how it correlates with algorithmic performance.
We first prove that precisely calculating the diversity of neural networks in adversarial ensemble learning is NP-hard, meaning it is extremely difficult computationally. To overcome this, we take a first-order approximation and decompose the adversarial ensemble loss into four components: the average loss of the individual models, prediction diversity (how differently the models predict), gradient diversity (how different their gradients are), and cross diversity (how predictions and gradients interact). This is the first diversity decomposition in adversarial ensemble learning, and it reveals that focusing only on gradient diversity, as many past methods do, is insufficient.
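For intuition, here is a schematic version of such a decomposition in our own notation, assuming the squared loss, an ensemble $\bar f = \frac{1}{m}\sum_{i=1}^m f_i$, and the first-order expansion $f_i(x+\delta) \approx f_i(x) + \delta^\top \nabla_x f_i(x)$; the paper's exact terms and weights may differ. Combining the classical ambiguity decomposition with this expansion gives

$$
\big(\bar f(x+\delta)-y\big)^2 \;\approx\; \underbrace{\frac{1}{m}\sum_{i=1}^m \big(f_i(x+\delta)-y\big)^2}_{\text{average individual adversarial loss}} \;-\; \underbrace{\frac{1}{m}\sum_{i=1}^m a_i^2}_{\text{prediction diversity}} \;-\; \underbrace{\frac{1}{m}\sum_{i=1}^m b_i^2}_{\text{gradient diversity}} \;-\; \underbrace{\frac{2}{m}\sum_{i=1}^m a_i b_i}_{\text{cross diversity}},
$$

where $a_i = f_i(x)-\bar f(x)$ and $b_i = \delta^\top\big(\nabla_x f_i(x)-\nabla_x \bar f(x)\big)$. The cross term shows why prediction and gradient diversity cannot be optimized in isolation.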
Based on this analysis, we introduce a new method called AdvEOAP (Adversarial Ensemble via Orthogonal Adversarial Predictions). It trains the ensemble members so that their predictions on adversarial examples are mutually orthogonal, boosting both gradient diversity and cross diversity. Experiments on standard datasets (MNIST, Fashion-MNIST, CIFAR-10) show that AdvEOAP significantly outperforms previous methods under a wide range of adversarial attacks.
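A minimal sketch of the orthogonality idea, not the authors' released code: all names, the penalty form, and the weighting coefficient lam are our own assumptions, illustrating how pairwise inner products of the members' adversarial predictions could be penalized during training.

```python
import torch
import torch.nn.functional as F

def orthogonality_penalty(adv_logits):
    """Encourage ensemble members' adversarial predictions to be mutually
    orthogonal by penalizing pairwise squared cosine similarities.

    adv_logits: list of tensors, each of shape (batch, num_classes), holding
    one member's logits on the *adversarial* inputs.
    """
    m = len(adv_logits)
    # Normalize so the penalty measures direction only, not magnitude.
    preds = [F.normalize(p, dim=1) for p in adv_logits]
    penalty = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            # Zero exactly when the two members' predictions are orthogonal.
            penalty = penalty + (preds[i] * preds[j]).sum(dim=1).pow(2).mean()
    return penalty / (m * (m - 1) / 2)

def ensemble_loss(models, x_adv, y, lam=1.0):
    # Hypothetical training objective: average adversarial loss plus the
    # orthogonality penalty; lam is a tunable coefficient (an assumption,
    # not a value taken from the paper).
    adv_logits = [model(x_adv) for model in models]
    avg_loss = sum(F.cross_entropy(z, y) for z in adv_logits) / len(models)
    return avg_loss + lam * orthogonality_penalty(adv_logits)
```

Under the decomposition above, driving adversarial predictions toward orthogonality spreads the members apart in both value and gradient, which is consistent with the paper's stated goal of improving gradient and cross diversity simultaneously.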
Primary Area: General Machine Learning->Everything Else
Keywords: Adversarial learning, Ensemble learning, Diversity
Submission Number: 6471