# HowTo

## SVD for EVA
Before finetuning EVA, you need to create an SVD checkpoint: the variable pca_filepath in train.sh must point to an existing SVD checkpoint.
- Create a conda environment from environment_eva_llm.yaml.
- Set the variables in run_pca_precompute.sh, such as base_path, model_names and dataset_name, then run the script.

## Finetuning
- Create a conda environment from environment_eva_llm.yaml (the same environment used for the SVD step).
- Set the variables in run_train.sh, such as base_path, model_names and dataset_name, then run the script.

## Evaluation
- For evaluation on GSM8K and MATH, run the respective scripts run_eval_gsm8k.sh and run_eval_math.sh.
- For evaluation on commonsense reasoning tasks, run lm_eval_harness.sh. For this to work, you first need to:
  - clone the lm-eval-harness [repo](https://github.com/EleutherAI/lm-evaluation-harness),
  - copy the custom tasks from the lm-evaluation-harness folder in this repository into the tasks folder of the cloned repo,
  - install the cloned repo via 'pip install -e .'
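
The copy step can also be scripted. The following is a minimal sketch, not part of this repository: the helper name `copy_custom_tasks` and both example paths are hypothetical, and it assumes that the tasks folder of the cloned harness is `lm_eval/tasks`. Adjust the paths to your local layout.

```python
import shutil
from pathlib import Path


def copy_custom_tasks(custom_tasks_dir, harness_tasks_dir):
    """Copy every entry of custom_tasks_dir into harness_tasks_dir.

    Returns the sorted names of the copied entries.
    """
    src = Path(custom_tasks_dir)
    dst = Path(harness_tasks_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"custom tasks folder not found: {src}")
    if not dst.is_dir():
        raise FileNotFoundError(f"harness tasks folder not found: {dst}")

    copied = []
    for entry in sorted(src.iterdir()):
        target = dst / entry.name
        if entry.is_dir():
            # Merge into an existing task folder instead of failing.
            shutil.copytree(entry, target, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, target)
        copied.append(entry.name)
    return copied


# Hypothetical usage; both paths are assumptions about the local layout:
# copy_custom_tasks("lm-evaluation-harness", "../lm-evaluation-harness/lm_eval/tasks")
```

Run 'pip install -e .' inside the cloned repo afterwards so that the copied tasks are picked up.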

# Dataset finetuning config

## meta-math/MetaMathQA
- num_examples_train 395000
- filter_long_context_examples False
- model_max_length 512
- batch_size 4
- gradient_accumulation_steps 2
- learning_rate 5e-4
- epochs 1

## Rowan/hellaswag
- num_examples_train 39905
- filter_long_context_examples False
- model_max_length 512
- batch_size 4
- gradient_accumulation_steps 2
- learning_rate 5e-4
- epochs 1

## allenai/winogrande
- num_examples_train 40398
- filter_long_context_examples False
- model_max_length 128
- batch_size 8
- gradient_accumulation_steps 1
- learning_rate 5e-4
- epochs 1

## allenai/ai2_arc_challenge
- num_examples_train 1119
- filter_long_context_examples False
- model_max_length 320
- batch_size 8
- gradient_accumulation_steps 1
- learning_rate 5e-4
- epochs 3

## allenai/ai2_arc_easy
- num_examples_train 2251
- model_max_length 256
- filter_long_context_examples False
- batch_size 8
- gradient_accumulation_steps 1
- learning_rate 5e-4
- epochs 3

## ybisk/piqa
- num_examples_train 16113
- model_max_length 2048
- filter_long_context_examples True
- batch_size 4
- gradient_accumulation_steps 2
- learning_rate 5e-4
- epochs 1

## allenai/social_i_qa
- num_examples_train 33410
- model_max_length 192
- filter_long_context_examples False
- batch_size 8
- gradient_accumulation_steps 1
- learning_rate 5e-4
- epochs 1

## allenai/openbookqa
- num_examples_train 4957
- model_max_length 192
- filter_long_context_examples False
- batch_size 8
- gradient_accumulation_steps 1
- learning_rate 5e-4
- epochs 2

## boolq
- num_examples_train 9427
- model_max_length 768
- filter_long_context_examples False
- batch_size 4
- gradient_accumulation_steps 2
- learning_rate 5e-4
- epochs 1

## qa_datasets
- model_max_length 1024
- filter_long_context_examples True
- batch_size 4
- gradient_accumulation_steps 2
- learning_rate 5e-4
- epochs 1
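
To compare the settings above, the following sketch computes the effective batch size and a rough number of optimizer steps per dataset. It is an illustration only, under these assumptions: batch_size is per device, training uses a single device (`num_devices=1`), the last partial batch is kept, and no examples are removed by filter_long_context_examples (so the step count for ybisk/piqa is an upper bound). The function names are hypothetical; the numbers are copied from the lists above. qa_datasets is left out because no num_examples_train is given for it.

```python
import math

# (num_examples_train, batch_size, gradient_accumulation_steps, epochs)
CONFIGS = {
    "meta-math/MetaMathQA": (395000, 4, 2, 1),
    "Rowan/hellaswag": (39905, 4, 2, 1),
    "allenai/winogrande": (40398, 8, 1, 1),
    "allenai/ai2_arc_challenge": (1119, 8, 1, 3),
    "allenai/ai2_arc_easy": (2251, 8, 1, 3),
    "ybisk/piqa": (16113, 4, 2, 1),
    "allenai/social_i_qa": (33410, 8, 1, 1),
    "allenai/openbookqa": (4957, 8, 1, 2),
    "boolq": (9427, 4, 2, 1),
}


def effective_batch_size(cfg, num_devices=1):
    """Examples consumed per optimizer step."""
    _, batch_size, grad_accum, _ = cfg
    return batch_size * grad_accum * num_devices


def optimizer_steps(cfg, num_devices=1):
    """Approximate total optimizer steps over all epochs."""
    num_examples, _, _, epochs = cfg
    per_epoch = math.ceil(num_examples / effective_batch_size(cfg, num_devices))
    return per_epoch * epochs


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(name, effective_batch_size(cfg), optimizer_steps(cfg))
```

Under these assumptions every dataset is trained with an effective batch size of 8: the configurations with longer model_max_length halve batch_size to 4 and double gradient_accumulation_steps to 2.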