L-AutoDA: Large Language Models for Automatically Evolving Decision-based Adversarial Attacks

Published: 01 Jan 2024 · Last Modified: 30 Sep 2024 · GECCO Companion 2024 · License: CC BY-SA 4.0
Abstract: In the rapidly evolving field of machine learning, adversarial attacks pose a significant threat to the robustness and security of models. Among these, decision-based attacks are particularly insidious because they require only the model's final decision output, which makes them notably challenging to counteract. This paper presents L-AutoDA (Large Language Model-based Automated Decision-based Adversarial Attacks), a methodology that harnesses the generative capabilities of large language models (LLMs) to streamline the creation of such attacks. L-AutoDA employs an evolutionary strategy in which iterative interactions with LLMs autonomously generate potent attack algorithms, reducing the need for human intervention. Evaluated on the CIFAR-10 dataset, L-AutoDA demonstrated substantial improvements over existing baseline methods in both success rate and computational efficiency. Our results highlight the utility of language models in crafting adversarial attacks and reveal promising directions for constructing more resilient AI systems.
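The evolutionary loop sketched in the abstract — an LLM repeatedly proposes candidate attack algorithms, the best candidates survive, and the survivors seed the next round of proposals — can be illustrated in miniature. The sketch below is an assumption-laden toy, not the paper's implementation: the `llm_propose` stub mutates a numeric parameter vector in place of a real LLM rewriting attack code, and `fitness` stands in for measuring attack success against a victim classifier.

```python
import random

def llm_propose(parent, rng):
    """Stand-in for an LLM call: in L-AutoDA the LLM would rewrite the
    parent attack algorithm's code; here we just perturb its parameters."""
    return [p + rng.gauss(0.0, 0.1) for p in parent]

def fitness(candidate):
    """Stand-in for evaluating attack success on the victim model
    (higher is better); here, closeness to a fixed illustrative target."""
    target = [0.5, -0.2, 1.0]
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))

def evolve(generations=50, pop_size=8, seed=0):
    """Elitist evolutionary search: keep the top half, refill the rest
    with LLM-proposed offspring of surviving parents."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = [llm_propose(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

best = evolve()
```

Because the top half of each generation survives unchanged, the best fitness is non-decreasing across generations; the real system applies the same selection pressure to LLM-generated attack code rather than to a toy parameter vector.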
