# Installing

### Virtual Environment and Dependencies

Set up the virtual environment and activate it:
```bash
python -m venv venv

# Windows (cmd or PowerShell):
.\venv\Scripts\activate

# Linux/macOS:
source ./venv/bin/activate
```

Install dependencies:
```bash
pip install -r requirements.txt
```

### LLM client configuration

Each LLMService client class requires its own credentials and URLs. Store the configuration files in the `.local` directory and fill them in as described below.

#### Services and their config files:

##### VertexLLMService - .local/vertex.json
```json
{
    "api_key": "your API key here",
    "model": "gemini-1.5-pro",
    "max_tokens": 1000000,
    "temperature": 0,
    "top_p": 1,
    "top_k": 0,
    "frequency_penalty": 0,
    "presence_penalty": 0
}
```

##### Local LLM API - e.g. .local/ollama.json
```json
{
    "api_url": "your API url here",
    "model": "llama3.3",
    "max_tokens": 8000,
    "temperature": 0,
    "top_p": 1,
    "top_k": 0,
    "frequency_penalty": 0,
    "presence_penalty": 0
}
```
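As a minimal sketch of how such a file might be read and sanity-checked before a service is constructed, the helper below loads a JSON config from `.local`. The name `load_llm_config`, the `REQUIRED_KEYS` set, and the rule that a config needs either `api_key` or `api_url` are assumptions made for illustration; the project's own LLMService classes may load their configuration differently.

```python
import json
from pathlib import Path

# Hypothetical helper, not part of the project: the keys checked here are
# taken from the example config files above.
REQUIRED_KEYS = {"model", "max_tokens", "temperature"}


def load_llm_config(name: str, config_dir: str = ".local") -> dict:
    """Read `<config_dir>/<name>.json` and check a few expected keys."""
    path = Path(config_dir) / f"{name}.json"
    with path.open(encoding="utf-8") as fh:
        config = json.load(fh)

    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"{path} is missing keys: {sorted(missing)}")

    # Assumption: hosted services use "api_key", local ones use "api_url".
    if "api_key" not in config and "api_url" not in config:
        raise KeyError(f"{path} needs either 'api_key' or 'api_url'")
    return config
```

For example, `load_llm_config("ollama")` would read `.local/ollama.json` relative to the current working directory.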

# Running

## First, generating problems with protocol_utils

A default set of 1000 problems is already included in the `data/oproblems` folder, but alternative problems can be generated with `src/protocol_utils`.

`protocol_utils` provides methods for generating 'tranches' for training, testing, and validating models. For now, however, only problems are needed, in order to benchmark model performance.

The key method for making a tranche is `protocol_utils.make_tranche`. Given a `TrancheConfig` `cfg`, a `seed`, an integer `n`, and an `out_dir` path, it generates a tranche of `n` problems according to the `cfg` settings and the seed.

The `TrancheConfig` dataclass is currently limited to:

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class TrancheConfig:
    # High-level controls
    name: str
    problem_kwargs: Dict[str, Any]
```

The available `problem_kwargs` can be found in the `src/symb/Problem.py` file.

To create a single new problem, use `sample_problem`: given a `TrancheConfig` and an `rng_seed`, it generates a specific problem, as an alternative to instantiating the `Problem` class directly.
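The relationship between a config, a seed, and a reproducible tranche can be sketched as follows. This is a self-contained illustration only: `sample_problem_stub` and `make_tranche_stub` are stand-ins written for this example, the `n_vars` keyword is made up, and the real `protocol_utils.make_tranche` and `sample_problem` may have different signatures, return types, and file formats.

```python
import json
import random
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, List


@dataclass
class TrancheConfig:
    name: str
    problem_kwargs: Dict[str, Any]


def sample_problem_stub(cfg: TrancheConfig, rng_seed: int) -> Dict[str, Any]:
    """Stand-in for sample_problem: returns a plain dict, not a Problem."""
    rng = random.Random(rng_seed)
    n_vars = cfg.problem_kwargs.get("n_vars", 2)  # "n_vars" is hypothetical
    return {
        "tranche": cfg.name,
        "seed": rng_seed,
        "coefficients": [rng.randint(1, 9) for _ in range(n_vars)],
    }


def make_tranche_stub(cfg: TrancheConfig, seed: int, n: int, out_dir: str) -> List[Path]:
    """Stand-in for make_tranche: writes n seeded problems to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    rng = random.Random(seed)  # one master seed drives every per-problem seed
    paths = []
    for i in range(n):
        problem = sample_problem_stub(cfg, rng.randrange(2**32))
        path = out / f"{cfg.name}_{i:04d}.json"
        path.write_text(json.dumps(problem), encoding="utf-8")
        paths.append(path)
    return paths
```

Because every per-problem seed is derived from the single master seed, calling the stub twice with the same `cfg`, `seed`, and `n` writes identical problems.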


## Second, running the baseline LLM experiments

The `src` folder includes two Python scripts: `baseline_llm_problem_answer.py` and `baseline_llm_nlp4opt_qa.py`. Each takes one argument, a string from the hard-coded list [`llama-3.3`, `llama-4`, `gemini`]. Users can amend this list as they wish, but each entry needs a corresponding LLMService in the LLMServices folder in order to function properly. `llama-3.3` corresponds to the Ollama LLMService, `llama-4` to the Vllama LLMService, and `gemini` to the Vertex LLMService.
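The mapping from model argument to service can be pictured as a simple lookup with validation. The sketch below is an assumption about how such a check could look, not the scripts' actual code; `MODEL_TO_SERVICE` and `resolve_service` are hypothetical names, and the values are labels rather than real class names.

```python
# Hypothetical lookup table mirroring the mapping described above.
MODEL_TO_SERVICE = {
    "llama-3.3": "Ollama",
    "llama-4": "Vllama",
    "gemini": "Vertex",
}


def resolve_service(model_name: str) -> str:
    """Return the service label for a model argument, or fail loudly."""
    try:
        return MODEL_TO_SERVICE[model_name]
    except KeyError:
        allowed = ", ".join(sorted(MODEL_TO_SERVICE))
        raise ValueError(
            f"Unknown model {model_name!r}; expected one of: {allowed}"
        ) from None
```

Adding a new model under this scheme would mean adding one entry to the table and providing the matching LLMService.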

The outputs of these scripts are stored in the following subfolders of the data folder:
- `baseline_0shot` and `baseline_0shot_with_sym` store the outputs of the `baseline_llm_problem_answer.py` script
- `baseline_0shot_nlp4lp` and `baseline_0shot_with_sym_nlp4lp` store the outputs of the `baseline_llm_nlp4opt_qa.py` script
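Before moving on to the analysis, it can help to confirm that each script has produced its output folders. The check below is a small sketch under the assumption that the subfolders sit directly under `data`; `OUTPUT_DIRS` and `missing_output_dirs` are hypothetical names introduced for this example.

```python
from pathlib import Path
from typing import Dict, List

# Script name -> output subfolders, as listed above.
OUTPUT_DIRS: Dict[str, List[str]] = {
    "baseline_llm_problem_answer.py": [
        "baseline_0shot",
        "baseline_0shot_with_sym",
    ],
    "baseline_llm_nlp4opt_qa.py": [
        "baseline_0shot_nlp4lp",
        "baseline_0shot_with_sym_nlp4lp",
    ],
}


def missing_output_dirs(data_dir: str = "data") -> Dict[str, List[str]]:
    """Return, per script, the expected output folders that do not exist yet."""
    root = Path(data_dir)
    result = {}
    for script, dirs in OUTPUT_DIRS.items():
        missing = [d for d in dirs if not (root / d).is_dir()]
        if missing:
            result[script] = missing
    return result
```

An empty result means every expected folder is present; it does not verify that the folders contain complete results.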

Once all configurations have been run to your satisfaction, run the `experiment_analysis.py` script to generate the analysis outputs, which should be saved to the analysis folder and the main folder.

