# Models

All models used are available on Hugging Face.

| Model | Link |
| --- | --- |
| Qwen3-0.6B | https://huggingface.co/Qwen/Qwen3-0.6B |
| Qwen3-1.7B | https://huggingface.co/Qwen/Qwen3-1.7B |
| Qwen3-4B | https://huggingface.co/Qwen/Qwen3-4B |
| Qwen3-8B | https://huggingface.co/Qwen/Qwen3-8B |
| Qwen3-14B | https://huggingface.co/Qwen/Qwen3-14B |
| Qwen2.5-0.5B-Instruct | https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct |
| Qwen2.5-1.5B-Instruct | https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct |
| Qwen2.5-3B-Instruct | https://huggingface.co/Qwen/Qwen2.5-3B-Instruct |
| Qwen2.5-7B-Instruct | https://huggingface.co/Qwen/Qwen2.5-7B-Instruct |
| Llama-3.1-8B | https://huggingface.co/meta-llama/Llama-3.1-8B |
| Llama-3.1-8B-Instruct | https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct |
| Llama-3.2-3B | https://huggingface.co/meta-llama/Llama-3.2-3B |
| Llama-3.2-3B-Instruct | https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct |
| FineMath-Llama-3B | https://huggingface.co/HuggingFaceTB/FineMath-Llama-3B |
| Apertus-8B | https://huggingface.co/swiss-ai/Apertus-8B-2509 |
| Apertus-8B-Instruct | https://huggingface.co/swiss-ai/Apertus-8B-Instruct-2509 |
| SmolLM3-3B | https://huggingface.co/HuggingFaceTB/SmolLM3-3B |

## Tokenizer for Non-Instruct Models
For the non-instruct models Llama-3.1-8B, Llama-3.2-3B, FineMath-Llama-3B, and Apertus-8B, we used the tokenizer of the corresponding instruct model.
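
The pairing above can be sketched as follows. This is a minimal illustration, not the code used for the experiments: the names `INSTRUCT_TOKENIZER`, `tokenizer_repo`, and `load_model_and_tokenizer` are hypothetical, and the entry for FineMath-Llama-3B assumes its instruct counterpart is Llama-3.2-3B-Instruct, which this README does not state explicitly. The sketch assumes the `transformers` library is installed.

```python
# Base-model repo -> repo whose tokenizer is loaded instead.
# Repo IDs are taken from the table above; the mapping itself is illustrative.
INSTRUCT_TOKENIZER = {
    "meta-llama/Llama-3.1-8B": "meta-llama/Llama-3.1-8B-Instruct",
    "meta-llama/Llama-3.2-3B": "meta-llama/Llama-3.2-3B-Instruct",
    "swiss-ai/Apertus-8B-2509": "swiss-ai/Apertus-8B-Instruct-2509",
    # Assumption: the README does not name the instruct counterpart of
    # FineMath-Llama-3B; Llama-3.2-3B-Instruct is a guess.
    "HuggingFaceTB/FineMath-Llama-3B": "meta-llama/Llama-3.2-3B-Instruct",
}


def tokenizer_repo(model_repo: str) -> str:
    """Return the repo to load the tokenizer from.

    Models without an entry in the mapping use their own tokenizer.
    """
    return INSTRUCT_TOKENIZER.get(model_repo, model_repo)


def load_model_and_tokenizer(model_repo: str):
    """Load the model weights from `model_repo` and the tokenizer from its paired repo."""
    # Imported lazily so the mapping can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_repo(model_repo))
    model = AutoModelForCausalLM.from_pretrained(model_repo)
    return model, tokenizer
```

For example, `load_model_and_tokenizer("meta-llama/Llama-3.2-3B")` would load the base weights together with the Llama-3.2-3B-Instruct tokenizer. The meta-llama repositories are gated, so downloading them requires accepting the license on Hugging Face and authenticating first.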
