Keywords: adversarial attack, scene text recognition, evolutionary algorithm, computer vision
TL;DR: We propose a pixel-level black-box attack that fools STR models into predicting more incorrect characters, using a novel multi-population coevolution search algorithm.
Abstract: Recent work has shown that scene text recognition (STR) models are vulnerable to adversarial examples.
Unlike the outputs of non-sequential vision tasks, the output sequence of an STR model contains rich information.
However, existing adversarial attacks against STR models can only lead to a few incorrect characters in the predicted text.
The resulting predictions still carry partial information about the original prediction and can easily be corrected by an external dictionary or a language model.
Therefore, we propose the Multi-Population Coevolution Search (MPCS) method to attack each character in the image.
We first decompose the global optimization objective into sub-objectives, which addresses the attack pixel concentration problem of previous attack methods.
Because this distributed optimization paradigm introduces a new problem, joint perturbation shift, we propose a novel coevolution energy function to solve it.
Experiments on recent STR models show the superiority of our method.
The code is available at https://github.com/Lee-Jingyu/MPCS.
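The abstract's motivating claim, that a prediction with only a few wrong characters can be recovered by an external dictionary, can be illustrated with a minimal sketch. This is not the paper's evaluation protocol: the lexicon, the example strings, and the helper name `dictionary_correct` are hypothetical, and the similarity measure is simply the one provided by Python's standard `difflib` module.

```python
import difflib

# Hypothetical lexicon, standing in for an external dictionary (assumption).
LEXICON = ["recognition", "restaurant", "station", "coffee"]


def dictionary_correct(prediction, lexicon, cutoff=0.6):
    """Return the lexicon word closest to `prediction`, or None if no word
    reaches the similarity cutoff (difflib's SequenceMatcher ratio)."""
    matches = difflib.get_close_matches(prediction, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# One wrong character: the original word is still recoverable.
weak_attack = dictionary_correct("recogmition", LEXICON)

# Every character wrong: nothing in the lexicon is close enough.
strong_attack = dictionary_correct("xqzvbwkjphy", LEXICON)

print(weak_attack, strong_attack)
```

In this toy setting the single-character error is mapped back to the original word, while the fully corrupted string has no close match, which is the kind of outcome an attack on every character aims for.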
Supplementary Material: zip
Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 27154