Abstract: Adversarial attacks have emerged as a critical threat to autonomous driving systems.
These attacks exploit the underlying neural network, allowing small -- almost invisible -- perturbations to alter the behavior of such systems in potentially malicious ways,
*e.g.*, causing a traffic sign classification network to misclassify a stop sign as a speed limit sign.
Prior work on hardening such systems against adversarial attacks has looked at fine-tuning the system or adding pre-processing steps to the input pipeline.
Such solutions either generalize poorly, require knowledge of the adversarial attacks during training, or incur significant computational overhead.
Instead, we propose a framework called *Elytra* that takes insights from parameter-efficient fine-tuning and uses low-rank adaptation (LoRA) to train a lightweight security patch (or patches), enabling us to dynamically patch large pre-existing vision systems as new vulnerabilities are discovered.
We demonstrate that the *Elytra* framework can patch pre-trained large vision models to improve classification accuracy by up to 24.09% in the presence of adversarial examples.
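To make the patching idea concrete, the following is a minimal sketch of a LoRA-style patch applied to a frozen weight matrix. The class and function names (`LoRAPatch`, `patched_forward`) are illustrative, not from the paper, and the sketch omits training; it only shows how a low-rank update can be added to, or removed from, a frozen layer.

```python
import numpy as np

class LoRAPatch:
    """Hypothetical low-rank 'security patch' for one frozen linear layer.

    Stores the update as two small factors A (rank x d_in) and
    B (d_out x rank), so only rank * (d_in + d_out) parameters are
    trained instead of the full d_out * d_in weight matrix.
    """

    def __init__(self, d_out, d_in, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.A = rng.normal(0.0, 0.01, (rank, d_in))  # trainable factor
        self.B = np.zeros((d_out, rank))              # zero init: patch starts as a no-op
        self.scale = alpha / rank                     # standard LoRA scaling

    def delta(self):
        # Low-rank weight update added to the frozen weights.
        return self.scale * (self.B @ self.A)


def patched_forward(W, x, patches):
    """Forward pass through frozen weights W plus any active patches."""
    W_eff = W + sum(p.delta() for p in patches)
    return W_eff @ x
```

Because the patch is additive and initialized as a no-op, it can be shipped, enabled, or rolled back independently of the large frozen model, which is the dynamic-patching property the abstract describes.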
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yinpeng_Dong2
Submission Number: 7816