# convert_domain  

Domain adaptation of text2sql benchmark data for the Asset domain.


## 1. Converting Domain
Run `convert_domain.py`:
1. Install the dependencies: `pip install -r requirements.txt`
2. Download [spider_data](https://yale-lily.github.io/spider) and unzip it.
3. Get the abstract graph for [asset](<redacted>).
4. Set up `.env` to contain the watsonx and rits keys.
5. Set the source to point to the root directory of the unzipped Spider data.
6. Run `python convert_domain.py -h` to see the parameter explanations.
7. Run `python convert_domain.py -g graph.json -s spider_data/dev.json --row 1 -r foo.csv` (regarding `--row 1`: if you want to run a trial, then remove this param).
8. The output is saved to `foo.csv`.


## 2. Validating examples
To run LLM evaluation on the output of `convert_domain.py`, run `validate_examples.py`.
Place the data under `data/` in CSV form, with `TargetSQL` and `TargetQuestion` as columns in the CSV. The program runs over each row, uses an LLM as a judge to validate whether the NL question matches the SQL, and gives the reasons for any rows it marks as incorrect.
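
Before running the validator, it can help to confirm that the CSV header has the two expected columns. The following is a minimal sketch using only the standard library. The helper name `missing_columns` and the sample row are hypothetical; only the column names `TargetSQL` and `TargetQuestion` come from the description above.

```python
import csv
import io

# Column names the validator expects, as stated in this README.
REQUIRED_COLUMNS = ("TargetSQL", "TargetQuestion")


def missing_columns(csv_file):
    """Return the required columns absent from the CSV header (hypothetical helper)."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Illustrative in-memory CSV; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "TargetQuestion,TargetSQL\n"
    "How many assets are there?,SELECT COUNT(*) FROM asset\n"
)
print(missing_columns(sample))  # [] means both columns are present
```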
Some things to note:
1. Column `validation` contains `Correct` or `Incorrect`, based on the LLM's evaluation of whether the NL question accurately represents the SQL query.
2. Column `explanations` gives the reason a row was marked incorrect. For example, `missing_column` means that the SQL query references a column in the select statement that is missing from the NL question. You can find a full list of explanations [here](<redacted link to maintain anonymity>).
3. The `missing_condition` flag seems to be overly sensitive, so you can reasonably treat rows flagged with `missing_condition` as correct. For example, a row gets flagged if the order of columns is switched or if the exact wording of a condition is not present.
4. Correct rows are marked with `NA` in the explanation. If the explainer decides that a row was wrongly marked as incorrect, the explanation says `correct`.
5. To filter, pick the rows whose explanation is in `["NA", "correct", "missing_condition"]`.
6. If you feel the LLM's evaluation is too strict, remove some of the constraints in the prompts [validate_rules](<redacted link>) and [explain_rules](<redacted link>) and rerun [validate_examples.py](<redacted link>).
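
The filtering step in note 5 can be sketched as follows. This is only an illustration under stated assumptions: the column is taken to be named `explanations` as in note 2, each cell is assumed to hold a single label, and the helper `keep_accepted` and the sample rows are hypothetical.

```python
import csv
import io

# Explanation labels treated as acceptable, per the notes above.
ACCEPTED = {"NA", "correct", "missing_condition"}


def keep_accepted(rows, column="explanations"):
    """Keep rows whose explanation label is in ACCEPTED.

    Assumes one label per cell; adjust if the file stores several.
    """
    return [row for row in rows if (row.get(column) or "").strip() in ACCEPTED]


# Illustrative data only; real rows come from the validator's output.
sample = io.StringIO(
    "TargetQuestion,validation,explanations\n"
    "q1,Correct,NA\n"
    "q2,Incorrect,missing_column\n"
    "q3,Incorrect,missing_condition\n"
    "q4,Incorrect,correct\n"
)
kept = keep_accepted(csv.DictReader(sample))
print([row["TargetQuestion"] for row in kept])  # ['q1', 'q3', 'q4']
```

If you load the file with pandas instead, note that `read_csv` treats the string `NA` as a missing value by default; passing `keep_default_na=False` keeps it as text.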

## 3. Domain Conversion:
Use `convert_domain/convert/convert_domain.py`; see the examples below.

*Please note: the filenames used below are examples; change them as needed to
run other domain conversions.*

**Spider-Dev to Asset**
```
./convert_domain/convert/convert_domain.py \
    --env env.mistral \
    --spider-file data/source/spider_data/dev.json \
    --target-graph data/target/asset/abstract_graph_13_objects.data.fk.json \
    --results-file data/generation/asset.spider.dev.20250728.csv \
        2>&1 | tee data/generation/asset.spider.dev.20250728.log
 ```

**Spider-Train to Asset**
```
./convert_domain/convert/convert_domain.py \
    --env env.mistral \
    --spider-file data/source/spider_data/train_spider.json \
    --target-graph data/target/asset/abstract_graph_13_objects.data.fk.json \
    --results-file data/generation/asset.spider.train.20250728.csv \
        2>&1 | tee data/generation/asset.spider.train.20250728.log
 ```

**Spider-Dev to Medical**
```
./convert_domain/convert/convert_domain.py \
    --env env.mistral \
    --spider-file data/source/spider_data/dev.json \
    --target-graph data/target/medical/abstract_graph_medical.json \
    --results-file data/generation/medical.spider.dev.20250728.csv \
        2>&1 | tee data/generation/medical.spider.dev.20250728.log
 ```

**Spider-Train to Medical**
```
./convert_domain/convert/convert_domain.py \
    --env env.mistral \
    --spider-file data/source/spider_data/train_spider.json \
    --target-graph data/target/medical/abstract_graph_medical.json \
    --results-file data/generation/medical.spider.train.20250728.csv \
        2>&1 | tee data/generation/medical.spider.train.20250728.log
 ```
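
The four invocations above differ only in the target domain and the Spider split, so they can be generated rather than typed by hand. The sketch below rebuilds the same argument lists from the paths and flags shown in the examples. The helper `build_command`, the two lookup tables, and the date argument are hypothetical conveniences, not part of the repository.

```python
import shlex

# Target graphs and Spider files as used in the examples above.
TARGET_GRAPHS = {
    "asset": "data/target/asset/abstract_graph_13_objects.data.fk.json",
    "medical": "data/target/medical/abstract_graph_medical.json",
}
SPIDER_FILES = {
    "dev": "data/source/spider_data/dev.json",
    "train": "data/source/spider_data/train_spider.json",
}


def build_command(target, split, date, env="env.mistral"):
    """Return (argument list, log path) for one conversion run (hypothetical helper)."""
    stem = f"data/generation/{target}.spider.{split}.{date}"
    args = [
        "./convert_domain/convert/convert_domain.py",
        "--env", env,
        "--spider-file", SPIDER_FILES[split],
        "--target-graph", TARGET_GRAPHS[target],
        "--results-file", f"{stem}.csv",
    ]
    return args, f"{stem}.log"


for target in TARGET_GRAPHS:
    for split in SPIDER_FILES:
        args, log_path = build_command(target, split, "20250728")
        # Print the shell equivalent; pass `args` to subprocess.run to execute it.
        print(f"{shlex.join(args)} 2>&1 | tee {log_path}")
```

Printing the commands first makes it easy to check the generated filenames before starting a long run.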


## 4. Paper2 Pipeline_1 evaluation:
Use `convert_domain/evaluate/run_paper2_pipeline1.sh`; see the examples below.

*Please note: the filenames used below are examples; change them as needed to
run other evaluations.*

**Target Domain Asset:**

```
./convert_domain/evaluate/run_paper2_pipeline1.sh \
    --env env.g328b \
    --src-db-dir data/source/spider_data/database \
    --query-file data/generation/asset.spider.dev.20250720.csv  \
    --target-db-file data/target/asset/asset.sqlite \
        2>&1 | tee data/generation/run_paper2_pipeline1.asset.spider.dev.20250720.log
```

**Target Domain Medical:**

```
./convert_domain/evaluate/run_paper2_pipeline1.sh \
    --env env.g328b \
    --src-db-dir data/source/spider_data/database \
    --query-file data/generation/medical.spider.dev.20250719.csv  \
    --target-db-file data/target/medical/medical.sqlite \
        2>&1 | tee data/generation/run_paper2_pipeline1.medical.spider.dev.20250719.log
```
