Abstract: The robustness of neural networks in image classification is crucial for resisting adversarial attacks. Although many researchers have proposed to enhance network robustness by devising new training paradigms or designing network architectures, existing approaches are mainly based on a single type of network, e.g., convolutional neural networks (CNNs) or vision Transformers (ViTs). Motivated by the recently revealed fact that CNNs and ViTs can effectively defend against adversarial attacks transferred from each other, this paper aims to enhance network robustness by designing robust hybrid-architecture networks that combine different types of networks. To this end, we propose a hybrid architecture-based evolutionary neural architecture search approach for robust architecture design, termed HA-ENAS. Specifically, to combine or aggregate different types of networks in the same network framework, a multi-stage block-wise hybrid architecture network is first devised as the supernet, where three types of blocks (convolution blocks, Transformer blocks, and multi-layer perceptron blocks) are designed as the candidates for each block position, thus establishing a hybrid architecture-based search space for HA-ENAS. The robust hybrid architecture search is then formulated as an optimization problem that maximizes both the clean accuracy and the adversarial accuracy of architectures, and an efficient multi-objective evolutionary algorithm is employed to solve it, where a supernet-based retraining evaluation and a surrogate model are used to mitigate the influence of coupled weights and to reduce the overall search cost. Experimental results show that the hybrid architectures found by the proposed HA-ENAS outperform state-of-the-art single-type architectures in terms of both clean accuracy and adversarial accuracy under a variety of common attacks.
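To make the search formulation concrete, the following is a minimal Python sketch of the bi-objective, block-wise encoding and Pareto selection outlined above. The block count, block-type names, and the placeholder evaluation function are illustrative assumptions for exposition, not the authors' implementation; in HA-ENAS the evaluation would inherit supernet weights and briefly retrain before measuring clean and adversarial accuracy.

```python
# Sketch of a hybrid, block-wise architecture encoding with two maximization
# objectives (clean accuracy, adversarial accuracy). Names are hypothetical.
import random
from typing import List, Tuple

BLOCK_TYPES = ["conv", "transformer", "mlp"]   # candidate block types per position
N_BLOCKS = 12                                  # assumed number of searchable blocks


def random_architecture() -> List[str]:
    """Encode a hybrid architecture as one block-type choice per block position."""
    return [random.choice(BLOCK_TYPES) for _ in range(N_BLOCKS)]


def evaluate_architecture(arch: List[str]) -> Tuple[float, float]:
    """Placeholder evaluation returning (clean_acc, adv_acc).

    A real implementation would inherit supernet weights, retrain briefly,
    and evaluate on clean and adversarially perturbed images.
    """
    clean_acc = random.uniform(0.70, 0.95)     # stand-in values only
    adv_acc = random.uniform(0.30, 0.60)
    return clean_acc, adv_acc


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """Pareto dominance for two objectives that are both maximized."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


# One selection step of a multi-objective search: keep non-dominated candidates.
population = [random_architecture() for _ in range(20)]
scores = [evaluate_architecture(arch) for arch in population]
pareto_front = [arch for arch, s in zip(population, scores)
                if not any(dominates(t, s) for t in scores)]
print(f"{len(pareto_front)} non-dominated hybrid architectures")
```

In practice, a surrogate model could replace most calls to the expensive evaluation, which is the role the surrogate plays in reducing the search cost described above.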