# LLM Inference, Training and Evaluation

This library contains common code for inference, training, and evaluation of LLMs.
The goals of the project are to
- speed up inference and training of LLMs
- enable inference and training for models that do not fit onto a single GPU
- share and standardize common evaluation code, e.g. for computing the probabilities of tokens and token sequences


## Setup

IMPORTANT: Clone the project via SSH (rather than HTTPS), since otherwise there will be issues pulling the submodules.
For this to work, you need to register an [SSH key with GitLab](https://docs.gitlab.com/ee/user/ssh.html).

After cloning the repository, run `./setup.sh` to initialize git submodules, install Poetry, the project dependencies, and the pre-commit hooks.

You can pull the latest updates (including those from the submodule dependencies) using `git pull --recurse-submodules`.

### Dependency Management

The library uses [Poetry](https://python-poetry.org/) to manage dependencies.
Poetry is more explicit than pip about how it keeps track of dependencies, and it creates virtual environments away from the actual project, which makes it easier to maintain projects under different environments.
Poetry is installed automatically by running the `setup.sh` script.

### Contributing to the library

The project uses [git pre-commit hooks](https://pre-commit.com/) to maintain code style and integrity, i.e. for code formatting, linting and unit-testing.
Pre-commit hooks are installed automatically by running the `setup.sh` script, and will also run automatically on subsequent commits.

To manually run the pre-commit hooks, execute `poetry run pre-commit run --all-files`.


## Usage

The project is intended to be used as a subcomponent, i.e. dependency, of a parent project.

### Adding the library to a project

Run `git submodule add git-rts@gitlab.ANONYMOUS-ANONYMOUS.org:ns/ml/llms/lib_llm.git <path>` to add the library as a submodule to a project.

You might have to add `sys.path.insert(0, os.path.dirname(__file__) + "/<path_to_lib_llm>/lib_llm")` in the main file of the parent project, in order to make the imports inside the library project work when used by the parent.
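The path adjustment above can be wrapped in a small helper so that it is applied only once. This is a minimal sketch, not part of the library: the function name `add_lib_llm_to_path` and the default submodule location `third_party/lib_llm` are hypothetical, and should be replaced by the `<path>` used when adding the submodule.

```python
import sys
from pathlib import Path

# Hypothetical submodule location relative to the parent's main file;
# replace it with the <path> passed to `git submodule add`.
LIB_LLM_SUBMODULE = Path("third_party") / "lib_llm"


def add_lib_llm_to_path(main_file: str, submodule: Path = LIB_LLM_SUBMODULE) -> str:
    """Prepend <dir of main_file>/<submodule>/lib_llm to sys.path and return it."""
    package_dir = str(Path(main_file).resolve().parent / submodule / "lib_llm")
    # Avoid duplicate entries if the helper is called more than once.
    if package_dir not in sys.path:
        sys.path.insert(0, package_dir)
    return package_dir


# In the parent project's main file, before importing from the library:
# add_lib_llm_to_path(__file__)
```

Using `pathlib` instead of string concatenation keeps the path correct regardless of the working directory from which the parent project is started.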

### Synchronizing updates in the library

Run `git submodule update --remote` in the parent project or `git pull` inside the submodule directory to fetch the latest changes from the server.
If you have checked out the library as a submodule and made changes that you would like to share, commit and push them from inside the library directory.
Please also see the notes under "Contributing to the library" above.

### Dependencies

The project is intended to be used with [Poetry](https://python-poetry.org/) to manage dependencies, but pip also works.
To install Poetry, consult [this guide](https://python-poetry.org/docs/master/#installing-with-the-official-installer).

Dependencies are declared inside the `pyproject.toml` file.
When using this library as a dependency for a parent project, just add the library dependencies to those of the parent project.
When using Poetry, you can add the line `lib_llm = { path = "<path_to_library>", develop = true }` inside the parent's `pyproject.toml` file under `[tool.poetry.dependencies]` and then run `poetry install`.
For pip, activate the parent's virtual environment and run `pip install <path_to_library>`.
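After either installation route, it can help to check that the package resolves in the active environment. The sketch below assumes that the importable module is named `lib_llm` (taken from the dependency name above; the actual import name may differ), and the helper `is_importable` is hypothetical.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "lib_llm" is an assumed module name; replace it if the package is imported differently.
    name = "lib_llm"
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status} in the active environment")
```

Run it with the parent project's interpreter, e.g. via `poetry run python` when using Poetry.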

### Examples

Examples for using the library are provided inside the `examples` directory.
`examples/inference.py` shows how the library can be used to perform inference and to compute token and sequence probabilities.
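For readers unfamiliar with the terms, the following sketch illustrates what token and sequence probabilities mean. It is not the library's API and does not reproduce `examples/inference.py`: it uses plain Python on toy logits, and all function names are hypothetical. The probability of a token is the softmax of the model's logits at that position, and the log-probability of a sequence is the sum of its per-token log-probabilities.

```python
import math
from typing import Sequence


def log_softmax(logits: Sequence[float]) -> list[float]:
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_log_probs(
    step_logits: Sequence[Sequence[float]], token_ids: Sequence[int]
) -> list[float]:
    """Log-probability of each observed token, given the logits at its position."""
    return [log_softmax(logits)[tok] for logits, tok in zip(step_logits, token_ids)]


def sequence_log_prob(
    step_logits: Sequence[Sequence[float]], token_ids: Sequence[int]
) -> float:
    """Log-probability of the whole sequence: the sum of per-token log-probabilities."""
    return sum(token_log_probs(step_logits, token_ids))


if __name__ == "__main__":
    # Toy example: vocabulary of 3 tokens, uniform logits at both positions,
    # so each token has probability 1/3 and the sequence approximately 1/9.
    logits = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
    print(math.exp(sequence_log_prob(logits, [0, 2])))
```

Working in log space avoids numerical underflow, which matters for long sequences where the product of many small probabilities would otherwise round to zero.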


### Running tests

Tests will run automatically as part of the pre-commit hooks before every commit.
If you want to manually run them (e.g. to see Docker build output), you can do so using `python tests/docker/run_dockerized_tests.py`.

The Docker image for running the tests is rather large because it includes CUDA and PyTorch.
Therefore, when running the tests for the first time, downloading and building the image will take a while.
Subsequent runs will be significantly faster though.
