# Graph2SMILES sample code
The code here reproduces the results for Graph2SMILES (D-GAT) on USPTO_50k.

# Environment setup
Ensure that conda is installed on the machine (i.e. conda init is runnable), then run
```
bash -i ./scripts/setup.sh
```
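Before running the setup script, it may help to confirm that conda is actually on the PATH. The snippet below is a minimal sketch using only standard shell built-ins; the variable name conda_status is illustrative and not part of this repo.

```shell
# Check whether the conda executable can be found on PATH.
# conda_status is a hypothetical variable name used only for this check.
if command -v conda >/dev/null 2>&1; then
    conda_status="found"
else
    conda_status="missing"
fi
echo "conda: ${conda_status}"
```

If this prints "conda: missing", install conda (or open a new shell after installing) before running scripts/setup.sh.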

# Preprocess
### Step 1 (Optional)
Convert the raw train, validation and test files (originally from the GLN repo) into source and target tokens.
```
python -m data.schneider50k.csv2txt
```
The files are pre-computed so this step is optional.

### Step 2
Compute the graph and sequence features, then binarize them.
```
sh scripts/preprocess.sh
```

# Train
```
sh scripts/train_g2s.sh
```
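Preprocessing and training can also be chained so that training only starts if preprocessing succeeds. The following is a sketch, not part of the repo: run_step and DRY_RUN are hypothetical names introduced here. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
# Hypothetical helper: print each command, and execute it only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run_step() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        # Run the command and report a failure without exiting the shell.
        "$@" || { echo "step failed: $*" >&2; return 1; }
    fi
}

# Training starts only if preprocessing returns successfully.
run_step sh scripts/preprocess.sh && run_step sh scripts/train_g2s.sh
```

Validation and prediction are left out of the chain because both require editing the corresponding script to point at a checkpoint first.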

# Validation
Once training is done, specify the checkpoint folder in scripts/validate.sh, then run
```
sh scripts/validate.sh
```

# Test
Select the best checkpoint based on the validation results and configure it in scripts/predict.sh, then run
```
sh scripts/predict.sh
```
