# vllm-w-lateStopping

To speed up inference on memory writes, we modified the vLLM framework to add late stopping. To run the code in the `MemLLM/inference/fast-indexing` directory, you need to install this modified library from the source provided here:

```
cd ./vllm-w-lateStopping
pip install -e .
```
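
After installing, it can be worth confirming that Python resolves `vllm` to this local checkout rather than to a stock release already in your environment. The snippet below is a minimal sketch using only the standard library; the helper name `describe_install` is hypothetical and not part of this repository.

```python
import importlib.metadata
import importlib.util


def describe_install(package_name="vllm"):
    """Return (version, location) for an importable package, or None if absent.

    Hypothetical helper for illustration; not part of vllm-w-lateStopping.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return None
    try:
        version = importlib.metadata.version(package_name)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this name.
        version = "unknown"
    # spec.origin is the path the package is loaded from; for an editable
    # install it should point inside the local source checkout.
    return version, spec.origin


if __name__ == "__main__":
    info = describe_install()
    if info is None:
        print("vllm is not installed in this environment")
    else:
        print(f"vllm {info[0]} loaded from {info[1]}")
```

If the printed location is outside the `vllm-w-lateStopping` directory, a different vLLM installation is probably taking precedence.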

