## Project Repository for Think Global and Act Local

**Xingchen Wan. 20 Jan 2021**

This repo contains my current implementation of the TurBO algorithm applied to a discrete (categorical) search space. 
It uses a categorical overlap kernel combined with Hamming distance-based bounding boxes, as well as local
search for acquisition optimization.
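
To make these three ingredients concrete, here is a minimal, dependency-free sketch. It is an illustration only and not the code used in this repo: the function names are hypothetical, the kernel is the plain (non-ARD) overlap kernel, and the local search is a simple greedy hill-climb restricted to a Hamming ball around a trust region centre.

```python
def overlap_kernel(x, y):
    """Plain overlap kernel: fraction of dimensions on which x and y agree."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of dimensions")
    return sum(a == b for a, b in zip(x, y)) / len(x)


def hamming_distance(x, y):
    """Number of dimensions on which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def neighbours(x, n_categories):
    """All points that differ from x in exactly one dimension."""
    x = tuple(x)
    result = []
    for i, n_cat in enumerate(n_categories):
        for c in range(n_cat):
            if c != x[i]:
                result.append(x[:i] + (c,) + x[i + 1:])
    return result


def local_search(acq, start, centre, radius, n_categories, max_steps=100):
    """Greedy ascent on acq, staying within Hamming distance `radius` of `centre`."""
    best = tuple(start)
    best_val = acq(best)
    for _ in range(max_steps):
        candidates = [z for z in neighbours(best, n_categories)
                      if hamming_distance(z, centre) <= radius]
        if not candidates:
            break
        cand = max(candidates, key=acq)
        cand_val = acq(cand)
        if cand_val <= best_val:
            break  # no neighbour inside the trust region improves on the incumbent
        best, best_val = cand, cand_val
    return best, best_val
```

Here `acq` stands for any acquisition function that maps a tuple of category indices to a score to be maximised, and `n_categories` lists the number of choices per dimension.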

Currently the benchmarks are ```pest```, ```contamination``` and ```maxsat60``` as categorical problems, 
```Func2C```, ```Func3C```, ```Ackley53```, ```MNIST-XGBOOST``` and ```NasBench101``` from the CoCaBO paper, and
finally the image adversarial attack problem adapted from BayesOpt Adversarial Attack.

# Dependencies
```
gpytorch
pytorch
numpy
pandas
matplotlib
tqdm (for progress bar visualisation)
pydoe == 0.3.8 (To run image adversarial attack problem)
tensorflow (To run image adversarial attack problem)
xgboost==0.90 (To run XGBoost on MNIST task)
...
(some dependencies may be missing; if you find one, please add it here)
```

# To run code
```bash
    python3 main.py -p pest -n 1 -a ei
```
where ```-p``` specifies the problem, ```-n``` specifies the number of trust regions to initialise, and ```-a``` specifies the acquisition function. See ```main.py``` for other settings that can be tuned.
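
To run several benchmarks in a row, the command above can be generated programmatically. The sketch below is a hypothetical helper, not part of this repo. It uses only the three flags described here and assumes that the benchmark names listed earlier (e.g. ```contamination```, ```maxsat60```) are accepted as values of ```-p```; check ```main.py``` before relying on that.

```python
def build_commands(problems, n_trust_regions=1, acquisition="ei"):
    """Build one `python3 main.py ...` argument list per problem name."""
    return [
        ["python3", "main.py",
         "-p", problem,               # problem name (assumed valid for -p)
         "-n", str(n_trust_regions),  # number of trust regions to initialise
         "-a", acquisition]           # acquisition function
        for problem in problems
    ]


commands = build_commands(["pest", "contamination", "maxsat60"])
```

Each entry of `commands` can then be passed to `subprocess.run(cmd, check=True)` from the repository root.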

To run the adversarial attack problem, you have to additionally:
- Download the CIFAR-10 data in binary format (```cifar-10-batches-bin```) and place the folder in ```mixed_test_func/AdvAttack/tf_models```.
Note that the adversarial attack uses a separate script, so run:
  ```bash
  python3 run_adversarial_attack.py 
  ```
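
Before launching the attack, it can help to confirm that the data folder is where the script expects it. The following is a small hypothetical check, not part of this repo. It assumes the standard file names of the CIFAR-10 binary release (five training batches and one test batch) and the folder location given above.

```python
from pathlib import Path

# File names in the binary release of CIFAR-10.
EXPECTED_FILES = [f"data_batch_{i}.bin" for i in range(1, 6)] + ["test_batch.bin"]


def missing_cifar10_files(
    data_dir="mixed_test_func/AdvAttack/tf_models/cifar-10-batches-bin",
):
    """Return the expected CIFAR-10 binary files that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_cifar10_files()` from the repository root returns an empty list when all six batch files are in place.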

# Todo list
- [x] Integer dimensions -- e.g. apply a wrapped Matern kernel on the integer dimensions
  
- [x] Batch setting for the mixed search space setting
- [x] Implement more problems. A starting point is the other problems in COMBO, such as their simplified NAS.
- [x] Combined search spaces like CoCaBO: currently my implementation deals with purely categorical 
problems only.
  
- [x] Implement the guided restart strategy discussed for TurBO-1
  
- [x] Implement the guided restart strategy discussed for TurBO-M (but need more testing)
- [x] **ARD categorical kernels**: currently the categorical overlap kernel is the basic version mentioned in Notion.
  As Vu and I found, disabling ARD for the original TurBO in continuous space leads to significant 
  performance degradation. Therefore, it is at least worth testing whether some form of ARD is helpful
  (I definitely think that would be the case).
  
  *Done*: this is rather simple in the gpytorch interface.
  
- [ ] Baselines: as a starting point, I suggest first simply reproducing the results in COMBO (both COMBO itself and the baselines
  such as SMAC/TPE etc.). It might also be helpful to reproduce the CoCaBO results (literally lifted from Robin and Vu's previous code).
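
As an illustration of the ARD item in the list above, one possible ARD variant of the overlap kernel gives each dimension its own non-negative weight. This is a dependency-free sketch under that assumption, with hypothetical names. It is not the gpytorch implementation used in this repo, where the weights would be learnable hyperparameters.

```python
def ard_overlap_kernel(x, y, weights):
    """Overlap kernel with one weight per dimension.

    With all weights equal to 1 this reduces to the basic overlap kernel,
    i.e. the fraction of dimensions on which x and y agree.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    # Only dimensions on which the two points agree contribute their weight.
    return sum(w * (a == b) for w, a, b in zip(weights, x, y)) / len(x)
```

A dimension with a weight close to zero is effectively ignored by the kernel, which is the sense in which ARD can switch off irrelevant categorical variables.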