### 🔗 Hugging Face Model Checkpoints

The ablation distillation datasets are available on our [Hugging Face datasets page](https://huggingface.co/datasets/TuneShift-KD/neurips2025-datasets/tree/main).

All models were trained using the TuneShift-KD pipeline and are hosted under the anonymous organization: [**TuneShift-KD**](https://huggingface.co/TuneShift-KD).

#### Cross-Architecture Models

* **GSM8K (Source: Llama 2 7B Chat, Target: Gemma 7B)**
  [`TuneShift-KD/gemma-gsm8k-lora-target-llama`](https://huggingface.co/TuneShift-KD/gemma-gsm8k-lora-target-llama)

* **MBPP (Source: Gemma 2B, Target: Gemma 7B)**
  [`TuneShift-KD/gemma-mbpp-lora-target-llama`](https://huggingface.co/TuneShift-KD/gemma-mbpp-lora-target-llama)

#### No-Filter Baselines

* [`TuneShift-KD/LLAMA-GSM8K-NO-FILTER`](https://huggingface.co/TuneShift-KD/LLAMA-GSM8K-NO-FILTER)
* [`TuneShift-KD/LLAMA-MBPP-NO-FILTER`](https://huggingface.co/TuneShift-KD/LLAMA-MBPP-NO-FILTER)
* [`TuneShift-KD/GEMMA-NO-FILTER-GSM8K`](https://huggingface.co/TuneShift-KD/GEMMA-NO-FILTER-GSM8K)
* [`TuneShift-KD/GEMMA-NO-FILTER-MBPP`](https://huggingface.co/TuneShift-KD/GEMMA-NO-FILTER-MBPP)

#### Ablation Models

* **Bounds**
  [`TuneShift-KD/llama-upper-lower-bound`](https://huggingface.co/TuneShift-KD/llama-upper-lower-bound)

* **Filtering Ratio Ablations**
  [`TuneShift-KD/llama-ratio`](https://huggingface.co/TuneShift-KD/llama-ratio)

* **Alternate Threshold (1.3)**
  [`TuneShift-KD/llama-1.3`](https://huggingface.co/TuneShift-KD/llama-1.3)

> 🛡️ All models are hosted anonymously to preserve double-blind review integrity.
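
The repositories listed above can also be fetched programmatically. The following is a minimal sketch, assuming the `huggingface_hub` package is installed (`pip install huggingface_hub`). The short dictionary keys and the `hub_url` and `fetch` helpers are hypothetical names introduced here for illustration. The sketch only downloads repository files and makes no assumption about whether a repository holds LoRA adapter weights or a full model.

```python
from typing import Optional

HUB_BASE = "https://huggingface.co"

# Repository IDs copied from the lists above; the keys are illustrative labels only.
CHECKPOINTS = {
    "gsm8k_cross_arch": "TuneShift-KD/gemma-gsm8k-lora-target-llama",
    "mbpp_cross_arch": "TuneShift-KD/gemma-mbpp-lora-target-llama",
    "llama_gsm8k_no_filter": "TuneShift-KD/LLAMA-GSM8K-NO-FILTER",
    "llama_mbpp_no_filter": "TuneShift-KD/LLAMA-MBPP-NO-FILTER",
    "gemma_gsm8k_no_filter": "TuneShift-KD/GEMMA-NO-FILTER-GSM8K",
    "gemma_mbpp_no_filter": "TuneShift-KD/GEMMA-NO-FILTER-MBPP",
    "bounds": "TuneShift-KD/llama-upper-lower-bound",
    "filtering_ratio": "TuneShift-KD/llama-ratio",
    "threshold_1_3": "TuneShift-KD/llama-1.3",
}
DATASET_REPO = "TuneShift-KD/neurips2025-datasets"


def hub_url(repo_id: str, repo_type: str = "model") -> str:
    """Build the web URL for a Hub repository (dataset URLs carry a 'datasets/' prefix)."""
    prefix = "datasets/" if repo_type == "dataset" else ""
    return f"{HUB_BASE}/{prefix}{repo_id}"


def fetch(repo_id: str, repo_type: str = "model", local_dir: Optional[str] = None) -> str:
    """Download every file in a repository and return the local folder path.

    The import is deferred so the helpers above work without huggingface_hub installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, repo_type=repo_type, local_dir=local_dir)
```

For example, `fetch(CHECKPOINTS["gsm8k_cross_arch"])` would download that checkpoint into the local Hub cache, and `fetch(DATASET_REPO, repo_type="dataset")` would do the same for the ablation datasets. How the downloaded weights are then loaded depends on the repository contents, which this sketch does not inspect.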
