# LookaheadKV Training / Inference Scripts

The scripts used to run our experiments are collected here.

- `lookaheadkv_inference.py` is used to run evaluations of our approach.
- `lookaheadkv_training.py` is used to train LookaheadKV modules.

Training and inference each require a separate modeling file that defines the behavior of the model: `modeling_llama_lookaheadkv_inference.py` for inference and `modeling_llama_lookaheadkv_training.py` for training.

Lastly, all theoretical latency experiments were conducted using `theoretical_latency_simulator.py`. The script can easily be modified to simulate models with different architectures.