# Data preparation
First, specify the dataset roots (SA-1B, LHQ, ...) in `custom_datasets/mypath.py`.
# Art filtering
Assume the SA-1B dataset has been downloaded into a folder structured like this:
```
sam_dataset
├── captions
│   ├── 0.txt
│   ├── 1.txt
│   └── ...
├── images
│   ├── sa_000000
│   │   ├── 0.jpg
│   │   ├── 1.jpg
│   │   └── ...
│   ├── sa_000001
│   │   ├── 0.jpg
│   │   ├── 1.jpg
│   │   └── ...
│   └── ...
└── ...
```

## Caption level filtering
``` shell
python custom_datasets/filt/sam_filt.py --mode caption_filt
```
## Image level filtering
``` shell
python custom_datasets/filt/sam_filt.py --mode clip_logit
python custom_datasets/filt/sam_filt.py --mode clip_filt
```

## Gather filtering results
After caption-level and image-level filtering, gather all results:
``` shell
python custom_datasets/filt/sam_filt.py --mode gather_result
```
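The filtering steps above can also be chained from a small Python driver. This is only a sketch: the script path and the four `--mode` values come from the commands above, while the helper names (`build_filter_commands`, `run_filter_pipeline`) are hypothetical. It also assumes the modes should run in the order they are listed, and that `sam_filt.py` needs no other arguments once the dataset roots are configured.

```python
import subprocess
import sys

# Modes copied from the commands above, in the order they are presented.
# Assumption: this is also the order in which they need to run.
FILTER_MODES = ["caption_filt", "clip_logit", "clip_filt", "gather_result"]


def build_filter_commands(script="custom_datasets/filt/sam_filt.py",
                          python=sys.executable):
    """Return one argv list per filtering mode (hypothetical helper)."""
    return [[python, script, "--mode", mode] for mode in FILTER_MODES]


def run_filter_pipeline(dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True stops the pipeline at the first step that fails.
    """
    for cmd in build_filter_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    run_filter_pipeline(dry_run=True)
```

Calling `run_filter_pipeline(dry_run=False)` from the repository root would execute the four steps in sequence.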
# Train artistic adaptor
## Data preparation
For a specific artistic style, prepare the dataset in the following format:
```
artist_name
├── captions
│   ├── a.txt
│   ├── b.txt
│   └── ...
├── paintings
│   ├── a.jpg/png
│   ├── b.jpg/png
│   └── ...
└── style_label.txt
```
The `style_label.txt` file should contain the artist name and the style name, separated by a comma: `artist_name, style_name`. For example, `Gustav Klimt, Art Nouveau`.
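Before training, it can help to check a style folder against the layout above. The following is a minimal sketch, not part of the repository: `check_style_folder` is a hypothetical helper, and it assumes that each painting is paired with the caption file that has the same base name (`a.jpg` with `a.txt`), as the example tree suggests.

```python
from pathlib import Path

# Image extensions shown in the layout above.
IMAGE_SUFFIXES = {".jpg", ".png"}


def check_style_folder(style_folder):
    """Parse style_label.txt and report unpaired captions or paintings.

    Assumption: captions and paintings are matched by file stem.
    """
    root = Path(style_folder)

    # style_label.txt is expected to look like "artist_name, style_name".
    label_text = (root / "style_label.txt").read_text(encoding="utf-8").strip()
    artist, _, style = label_text.partition(",")
    artist, style = artist.strip(), style.strip()
    if not artist or not style:
        raise ValueError(f"Expected 'artist_name, style_name', got: {label_text!r}")

    captions = {p.stem for p in (root / "captions").glob("*.txt")}
    paintings = {
        p.stem
        for p in (root / "paintings").iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    }

    return {
        "artist": artist,
        "style": style,
        "missing_captions": sorted(paintings - captions),
        "missing_paintings": sorted(captions - paintings),
    }
```

An empty `missing_captions` and `missing_paintings` list means every painting has a caption and vice versa.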

## Training
The training script is `train_artistic.py`. The following arguments should be specified:
- `--style_folder`: the path to the folder containing the dataset
- `--save_path`: the path to save the trained adaptor
- `--rank`: the rank of the adaptor
- `--iterations`: the number of iterations to train the adaptor


For example,
``` shell
python train_artistic.py --style_folder <style_folder> --save_path <save_path> --rank 1 --iterations 1000
```
The trained adaptor will be saved under a name like `<save_path>/single_scale_alpha1.0_rank1_full_1000steps.pt`.
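When scripting training followed by inference, the checkpoint path can be derived from the training arguments. The sketch below is an assumption based only on the example name above: it guesses that the `rank1` and `1000steps` fields follow `--rank` and `--iterations`, and copies the remaining fields (`single_scale`, `alpha1.0`, `full`) verbatim, although they may depend on other options. `expected_adaptor_path` is a hypothetical helper.

```python
from pathlib import Path


def expected_adaptor_path(save_path, rank, iterations):
    """Guess the checkpoint path from the training arguments.

    Assumption: only the rank and step count vary; the other parts of the
    file name are copied from the single example in this README.
    """
    name = f"single_scale_alpha1.0_rank{rank}_full_{iterations}steps.pt"
    return Path(save_path) / name
```

If the guessed file does not exist after training, list the contents of `<save_path>` to find the actual name.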


# Inference with trained artistic adaptor
## Art Generation
To generate art from scratch, pass `--from_scratch`. The `--start_noise` argument sets the time step (0 to 1000) at which the adaptor starts to be incorporated into the generation; the larger the value, the more artistic the result. A value of `-1` means the adaptor is used throughout the whole process. Custom prompts can be specified with `--infer_prompts`. For example,
``` shell
python inference.py --lora_weights <lora_location> --from_scratch --start_noise -1 --infer_prompts "Sunset over the ocean with waves and rocks" --save_dir <save_location>
```
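To build intuition for `--start_noise`, the sketch below models one plausible reading of the description above. It is not the code in `inference.py`: it assumes sampling runs from high time steps down to 0 and that the adaptor is applied once the current time step is at or below `start_noise`, which would explain why larger values give a stronger effect. The function names and the 50-step schedule are hypothetical.

```python
def adaptor_active(timestep, start_noise):
    """Return True if the adaptor would be applied at this time step.

    Assumption: -1 enables the adaptor for every step; otherwise it is
    enabled once the (descending) time step reaches start_noise.
    """
    if start_noise == -1:
        return True
    return timestep <= start_noise


def count_active_steps(timesteps, start_noise):
    """Count how many sampling steps would use the adaptor."""
    return sum(adaptor_active(t, start_noise) for t in timesteps)


# Hypothetical 50-step schedule: 980, 960, ..., 20, 0.
timesteps = list(range(980, -1, -20))

for start_noise in (200, 600, -1):
    active = count_active_steps(timesteps, start_noise)
    print(f"start_noise={start_noise}: adaptor used in {active}/{len(timesteps)} steps")
```

Under these assumptions, raising `start_noise` increases the share of sampling steps that use the adaptor, and `-1` covers all of them.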

## Image Stylization
For style transfer, the `--val_set` argument specifies the name of the validation set (`lhq9` or `lhq500`). For example,
``` shell
python inference.py --lora_weights <lora_location> --start_noise 600 --val_set lhq9 --save_dir <save_location>
```