# One Shot, One Kill: Attacking Video Object Segmentation with a Single Frame

## Requirements
   * Python3
   * pytorch >= 1.7.0 and torchvision
   * opencv-python
   * Pillow
   * tqdm
   * numpy
   * scikit-image
   * Pytorch Correlation
     ```bash
     git clone https://github.com/ClementPinard/Pytorch-Correlation-extension.git
     cd Pytorch-Correlation-extension
     python setup.py install
     ```
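
Before moving on, it can help to confirm that the packages above are importable. The snippet below is a minimal sketch and not part of this repository: the helper names `find_missing` and `version_at_least` are hypothetical, and the import name `spatial_correlation_sampler` for the Pytorch Correlation extension is an assumption about how that package installs.

```python
import importlib.util
import re

# Maps each requirement listed above to the module name it is imported under.
# "spatial_correlation_sampler" is an assumed import name for the
# Pytorch-Correlation-extension; adjust it if your install differs.
REQUIRED_MODULES = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "opencv-python": "cv2",
    "Pillow": "PIL",
    "tqdm": "tqdm",
    "numpy": "numpy",
    "scikit-image": "skimage",
    "Pytorch Correlation": "spatial_correlation_sampler",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the requirement names whose import module cannot be located."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


def version_at_least(version, minimum=(1, 7, 0)):
    """Check a version string such as '1.13.1+cu117' against a minimum tuple."""
    parts = []
    # Drop any local build suffix ("+cu117") and keep major.minor.patch.
    for piece in version.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    parts += [0] * (3 - len(parts))
    return tuple(parts) >= tuple(minimum)


if __name__ == "__main__":
    missing = find_missing()
    print("Missing packages:", ", ".join(missing) if missing else "none")
    if "pytorch" not in missing:
        import torch

        print("pytorch >= 1.7.0:", version_at_least(torch.__version__))
```

Using `importlib.util.find_spec` locates each module without importing it, so the check stays fast and only imports `torch` when its version needs to be read.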

## Getting Started
0. Prepare a valid environment by following the [requirements](#requirements).

1. Prepare datasets:

    Follow the instructions below to place each dataset in its corresponding folder.

    * **YouTube-VOS**

        [datasets/YTB/2019](datasets/YTB/2019): the 2019 version, available from this download [link](https://codalab.lisn.upsaclay.fr/competitions/6066#participate-get-data).

    * **DAVIS**

        [datasets/DAVIS](datasets/DAVIS): [TrainVal](https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-trainval-480p.zip) (480p) contains both the training and validation split. [Test-Dev](https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-test-dev-480p.zip) (480p) contains the Test-dev split.


2. Prepare ImageNet pre-trained encoders:

    Download [MobileNet-V2](https://download.pytorch.org/models/mobilenet_v2-b0353104.pth) into [pretrain_models](pretrain_models).

3. Generate poisoned datasets:
    * **YouTube-VOS**
   
        Generate the poisoned YouTube-VOS test set by running [poison_ytb.py](poison_ytb.py).
    * **DAVIS** 
   
        Generate the poisoned DAVIS test set by running [poison_dav.py](poison_dav.py).

4. Training and evaluation:

    The [example script](train_eval.sh) trains AOTT on YouTube-VOS 2019 using 4 GPUs.
    ```bash
    sh train_eval.sh
    ```
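
The steps above place datasets, weights, and scripts at specific paths, so a quick check before launching training can catch a missing download early. The following is a minimal sketch and not part of this repository: `missing_paths` is a hypothetical helper, and it assumes the MobileNet-V2 checkpoint keeps the file name from its download URL. It only tests that each path exists, not that the datasets are complete.

```python
from pathlib import Path

# Paths taken from the steps above, relative to the repository root.
EXPECTED_PATHS = [
    "datasets/YTB/2019",
    "datasets/DAVIS",
    # Assumption: the checkpoint keeps the file name used in the download URL.
    "pretrain_models/mobilenet_v2-b0353104.pth",
    "poison_ytb.py",
    "poison_dav.py",
    "train_eval.sh",
]


def missing_paths(repo_root=".", expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing before training:")
        for rel in missing:
            print("  -", rel)
    else:
        print("All expected paths are present.")
```

Run it from the repository root, or pass the root explicitly, for example `missing_paths("/path/to/repo")`.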





