# Adding new models

Currently, you must clone the HELM repository and modify the HELM code locally to add a new model.

1. Pick a new organization name that does not conflict with any of the organizations for the [existing models](models.md).
2. Define a new [`Model`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/proxy/models.py#L39) describing your model and add it to [`ALL_MODELS` in `helm.proxy.models`](https://github.com/stanford-crfm/helm/blob/main/src/helm/proxy/models.py#L88). Set `Model.group` and `Model.creator_organization` to your organization name. Set `Model.name` to `"your_organization_name/your_model_name"`.
3. Implement a new [`Client`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/proxy/clients/client.py#L16). The `Client` is responsible for implementing the core logic of your model, and responding to requests. In particular, it needs to implement three methods, `Client.make_request()`, `Client.tokenize()` and `Client.decode()`. Note that despite the name `Client`, it is possible for `Client` to run model inference locally. Refer to [`SimpleClient`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/proxy/clients/simple_client.py#L15) for an example implementation.
4. Modify both [`AutoClient.get_client()`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/proxy/clients/auto_client.py#L56) and [`AutoClient.get_tokenizer_client()`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/proxy/clients/auto_client.py#L143) to return your implementation of `Client` for your organization name.
5. Implement a [`WindowService`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/benchmark/window_services/window_service.py#L24). You will usually want to make your implementation a subclass of [`LocalWindowService`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/benchmark/window_services/local_window_service.py#L15), which will reuse the `Client.tokenize()` and `Client.decode()` methods that you implemented earlier. The subclass needs to implement the methods `LocalWindowService.max_sequence_length()`, `LocalWindowService.max_request_length()`, `LocalWindowService.end_of_text_token()`, `LocalWindowService.prefix_token()` and `LocalWindowService.tokenizer_name()`. In particular, `LocalWindowService.tokenizer_name()` should return `"your_organization_name/your_model_name"`. Refer to [`GPT2WindowService`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/benchmark/window_services/gpt2_window_service.py#L5) for an example implementation.
6. Modify [`WindowServiceFactory.get_window_service()`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/benchmark/window_services/window_service_factory.py#L30) to construct and return your implementation of `WindowService` for your organization name.
7. Add an entry to [`schema.yaml`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/benchmark/static/schema.yaml) describing your model.
8. Update [`contamination.yaml`](https://github.com/stanford-crfm/helm/blob/v0.2.0/src/helm/benchmark/static/contamination.yaml) to indicate whether the training set of your model has been contaminated by data from any scenarios.
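
To make the shape of steps 3 and 4 more concrete, here is a minimal, self-contained sketch of a client that runs a toy "model" locally and is selected by organization name. It deliberately does not import HELM: `ToyRequest`, `ToyClient` and `get_client` are hypothetical stand-ins invented for illustration, and the real `Client` methods take and return HELM's own request and result objects. Take the actual signatures from the `Client` and `SimpleClient` sources linked above.

```python
from dataclasses import dataclass
from typing import Dict, List


# Hypothetical, simplified stand-in for a HELM request object.
@dataclass(frozen=True)
class ToyRequest:
    model: str  # "your_organization_name/your_model_name"
    prompt: str
    max_tokens: int = 5


class ToyClient:
    """Toy client exposing the three method names listed in step 3."""

    def __init__(self) -> None:
        self._token_to_id: Dict[str, int] = {}
        self._id_to_token: List[str] = []

    def tokenize(self, text: str) -> List[int]:
        # Whitespace tokenizer that grows its vocabulary on demand.
        ids = []
        for token in text.split():
            if token not in self._token_to_id:
                self._token_to_id[token] = len(self._id_to_token)
                self._id_to_token.append(token)
            ids.append(self._token_to_id[token])
        return ids

    def decode(self, token_ids: List[int]) -> str:
        # Inverse of tokenize(), up to whitespace normalization.
        return " ".join(self._id_to_token[i] for i in token_ids)

    def make_request(self, request: ToyRequest) -> str:
        # "Inference" runs locally: echo the last max_tokens prompt tokens.
        ids = self.tokenize(request.prompt)
        n = max(0, min(request.max_tokens, len(ids)))
        return self.decode(ids[len(ids) - n:])


def get_client(model_name: str) -> ToyClient:
    # Dispatch on the organization prefix of the model name (hypothetical
    # analogue of the per-organization branching described in step 4).
    organization = model_name.split("/")[0]
    if organization == "your_organization_name":
        return ToyClient()
    raise ValueError(f"Unknown organization: {organization}")
```

A caller would obtain the client with `get_client("your_organization_name/your_model_name")` and then call `make_request()` on it. Keeping `tokenize()` and `decode()` on the same object as `make_request()` is what lets a window service reuse them, as step 5 describes.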
