# Overview
This repository contains modified model code and training code for the LLaMA and Qwen models. The modified modeling files are intended to replace the corresponding files in the latest transformers library for inference, and a separate script implements knowledge distillation training for LLaMA.

## Files
### modeling_llama.py
- Description: This file contains the modified inference code for the LLaMA model.
- Key Changes: Replaces the corresponding file in the latest transformers library with a custom implementation tailored for LLaMA inference.
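
A minimal sketch of one way to swap a modified file into place, assuming the file is meant to overwrite its counterpart inside an installed copy of transformers. The helper name `replace_with_backup` is hypothetical and is not part of this repository or of transformers. It keeps a `.bak` copy of the original so the change can be undone.

```python
import shutil
from pathlib import Path


def replace_with_backup(modified_file, target_file):
    """Overwrite target_file with modified_file, keeping a one-time .bak copy.

    Hypothetical helper for illustration; not part of this repository.
    """
    modified_file = Path(modified_file)
    target_file = Path(target_file)
    if not modified_file.is_file():
        raise FileNotFoundError(f"modified file not found: {modified_file}")
    if not target_file.is_file():
        raise FileNotFoundError(f"target file not found: {target_file}")

    # Back up the original only once, so repeated runs never overwrite
    # the pristine copy with an already modified file.
    backup = target_file.with_name(target_file.name + ".bak")
    if not backup.exists():
        shutil.copy2(target_file, backup)

    shutil.copy2(modified_file, target_file)
    return backup
```

For LLaMA, the target would typically be `models/llama/modeling_llama.py` under the directory of the installed `transformers` package (for example, `Path(transformers.__file__).parent`). Check that path against the installed version before overwriting anything.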
### modeling_qwen.py
- Description: This file contains the modified inference code for the Qwen model.
- Key Changes: Replaces the corresponding file in the latest transformers library with a custom implementation tailored for Qwen inference.
### train_llama.py
- Description: This script implements knowledge distillation training for the LLaMA model.
- Key Features:
  - Knowledge distillation to improve model efficiency and performance.
  - Custom training pipeline for LLaMA.
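
To illustrate the general idea behind knowledge distillation, here is a minimal pure-Python sketch of a common distillation loss for a single token position. It is not taken from `train_llama.py`. The function names, the `temperature` and `alpha` parameters, and the mix of a soft (teacher-matching) term with a hard (label) term are assumptions that follow the standard formulation. The actual script may use a different loss.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft teacher-matching loss and a hard label loss.

    Illustrative only; the parameter names and defaults are assumptions.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))
    # Scaling by T^2 keeps the soft-term gradients comparable across temperatures.
    soft_loss = kl * temperature ** 2

    # Standard cross-entropy against the ground-truth label (temperature 1).
    hard_loss = -log_softmax(student_logits)[label]

    return alpha * soft_loss + (1 - alpha) * hard_loss
```

In a real training pipeline the same computation would be done on batched tensors, with the teacher's logits computed without gradient tracking.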