This document provides additional results and analysis for our study in the main paper.
In the first experiment, we applied the adversarial attacks against TransT-SEG and MixFormerM and created videos of the tracker output before the attack (green mask/BBOX) and after the attack (red mask/BBOX).
The white-box attacks are more effective against the TransT tracker, whether the evaluation is based on the bounding box or the binary mask.
Black-box attacks against TransT-SEG
White-box attacks against TransT-SEG
Black-box attacks against MixFormerM
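The two evaluation modes mentioned above (bounding box and binary mask) can be compared with a minimal sketch. It assumes that overlap is scored with intersection over union (IoU), a common choice in tracking benchmarks, and that boxes are given as (x, y, width, height). The function names are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def bbox_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x, y, width, height) (assumed format)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def mask_iou(mask_a, mask_b):
    """IoU of two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)
```

Comparing the score of the clean prediction (green) with that of the attacked prediction (red) against the same ground truth gives a per-frame measure of how much an attack degrades the output under either evaluation mode.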
In this section, we applied the adversarial attacks against TransT and created a series of videos showing the perturbed search regions and perturbation maps at different perturbation levels for the white-box approaches SPARK and RTAA. The search regions after the attack may show different areas of the same frame, depending on the effect of each attack and the resulting bounding box degradation.
Any perturbed region with an SSIM lower than 50% is considered a super-perturbed region. At lower perturbation levels, the perceptibility of the generated perturbations is greater, while at higher levels the number of super-perturbed frames increases.
Perturbed search regions and Perturbation maps: ε = 2.55
Perturbed search regions and Perturbation maps: ε = 5.1
Perturbed search regions and Perturbation maps: ε = 10.2
Perturbed search regions and Perturbation maps: ε = 20.4
Perturbed search regions and Perturbation maps: ε = 40.8
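The super-perturbed criterion described above (SSIM below 50%) can be illustrated with a short sketch. It is a simplification under stated assumptions: SSIM is computed once over the whole region rather than with the sliding local windows of the standard formulation (as implemented, for example, by `skimage.metrics.structural_similarity`), and pixel values are assumed to lie in [0, 255]. The function names and the `threshold` parameter are hypothetical.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over two same-shaped grayscale regions (simplified sketch)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Stabilising constants from the standard SSIM definition (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def is_super_perturbed(clean_region, perturbed_region, threshold=0.5):
    """Flag a region whose SSIM against the clean region falls below the threshold."""
    return global_ssim(clean_region, perturbed_region) < threshold
```

Counting the frames for which `is_super_perturbed` returns `True` at each perturbation level would give the number of super-perturbed frames per level, although the exact SSIM settings used for the videos are not specified here.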
We created video sequences using the original tracking sequences as a base. These videos were generated by attacking the ROMTrack tracker with the IoU method at different perturbation levels.
Perturbed Frame: ζ = 8k
Perturbed Frame: ζ = 10k
Perturbed Frame: ζ = 12k
Perturbation Map: ζ = 8k
Perturbation Map: ζ = 10k
Perturbation Map: ζ = 12k