# Codebase Overview

## Hyperparameters
- Hyperparameters for the experiments we ran on the Qwen3 models can be found in the folder named `hyperparam_files`.  
- The hyperparameters for the gpt-oss-20b model are set in the bash script in the `experiment_scripts/eval` directory.

## Prompt-finetuning

- All relevant Python scripts for prompt-finetuning are under the `experiment_scripts/eval` directory.
- The prompts for the four different representations can be found in the `prompts` folder under their respective names. 

    > **Note:**
    > Throughout the codebase, "Semantic Triples" are referred to as `knowledge_graphs`.
- To run the `evaluate_cs_models.py` script, refer to the `batch_eval.sh` file. 
- The `fetch_batch.py` script is intended for cases where `evaluate_cs_models.py` is interrupted after the batch has been uploaded to the API.

## Fine-tuning

- The fine-tuning script can be found under the `experiment_scripts/finetuning` directory.
- The training hyperparameters can be found in `training_hyperparams.yml` in the `hyperparam_files` folder.
- The hyperparameters used for the evaluation of the fine-tuned models can be found in `finetune_eval.yml`.

## Dataset Generation

- All scripts relevant to dataset generation can be found in the `dataset` folder.
- The final CSV files for the train, validation, and test splits of the ALIST8.5K dataset are under the `data` folder, along with the datasets used to generate the alists.
- Prompts for generation can be found at `prompts/generation.json`.
- To run the generation of alist samples (either from pre-existing natural language or from scratch) refer to the `generate.sh` file. 
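
As a quick sanity check on the split files, the following is a minimal sketch that lists every CSV directly under a folder and reports its row count and column names. The function name `summarize_csv_splits` is hypothetical and not part of this codebase, and the sketch makes no assumption about the actual file names or columns in `data`.

```python
import csv
from pathlib import Path


def summarize_csv_splits(data_dir):
    """Return {file stem: (row count, column names)} for each CSV in data_dir.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    summary = {}
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            # Iterating consumes the data rows; the header is read on first access.
            row_count = sum(1 for _ in reader)
            summary[csv_path.stem] = (row_count, reader.fieldnames or [])
    return summary


if __name__ == "__main__":
    # Assumes the script is run from the repository root, next to `data`.
    for name, (rows, columns) in summarize_csv_splits("data").items():
        print(f"{name}: {rows} rows, columns={columns}")
```

Only files directly inside the given folder are inspected; if the CSVs sit in subfolders, the glob pattern would need to be adjusted.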

## Analysis

- The code for calculating the adherence scores of the model outputs can be found in `adherence.py` in the `experiment_scripts/eval` folder. 

