Metadata-Version: 2.4
Name: vllm
Version: 0.1.dev102+g1694c95.d20250521.precompiled
Summary: A high-throughput and memory-efficient inference and serving engine for LLMs
Home-page: https://github.com/vllm-project/vllm
Author: vLLM Team
License: Apache 2.0
Project-URL: Homepage, https://github.com/vllm-project/vllm
Project-URL: Documentation, https://vllm.readthedocs.io/en/latest/
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: psutil
Requires-Dist: sentencepiece
Requires-Dist: numpy<2.0.0
Requires-Dist: requests>=2.26.0
Requires-Dist: tqdm
Requires-Dist: blake3
Requires-Dist: py-cpuinfo
Requires-Dist: transformers>=4.45.2
Requires-Dist: tokenizers>=0.19.1
Requires-Dist: protobuf
Requires-Dist: fastapi<0.113.0,>=0.107.0; python_version < "3.9"
Requires-Dist: fastapi!=0.113.*,!=0.114.0,>=0.107.0; python_version >= "3.9"
Requires-Dist: aiohttp
Requires-Dist: openai>=1.52.0
Requires-Dist: uvicorn[standard]
Requires-Dist: pydantic>=2.9
Requires-Dist: prometheus_client>=0.18.0
Requires-Dist: pillow
Requires-Dist: prometheus-fastapi-instrumentator>=7.0.0
Requires-Dist: tiktoken>=0.6.0
Requires-Dist: lm-format-enforcer<0.11,>=0.10.9
Requires-Dist: outlines==0.1.11
Requires-Dist: lark==1.2.2
Requires-Dist: xgrammar>=0.1.6; platform_machine == "x86_64"
Requires-Dist: typing_extensions>=4.10
Requires-Dist: filelock>=3.16.1
Requires-Dist: partial-json-parser
Requires-Dist: pyzmq
Requires-Dist: msgspec
Requires-Dist: gguf==0.10.0
Requires-Dist: importlib_metadata
Requires-Dist: mistral_common[opencv]>=1.5.0
Requires-Dist: pyyaml
Requires-Dist: six>=1.16.0; python_version > "3.11"
Requires-Dist: setuptools>=74.1.1; python_version > "3.11"
Requires-Dist: einops
Requires-Dist: compressed-tensors==0.8.1
Requires-Dist: depyf==0.18.0
Requires-Dist: cloudpickle
Requires-Dist: datasets
Requires-Dist: matplotlib
Requires-Dist: pandas
Requires-Dist: ray[default]>=2.9
Requires-Dist: nvidia-ml-py>=12.560.30
Requires-Dist: torch==2.5.1
Requires-Dist: torchvision==0.20.1
Requires-Dist: xformers==0.0.28.post3; platform_system == "Linux" and platform_machine == "x86_64"
Provides-Extra: tensorizer
Requires-Dist: tensorizer>=2.9.0; extra == "tensorizer"
Provides-Extra: runai
Requires-Dist: runai-model-streamer; extra == "runai"
Requires-Dist: runai-model-streamer-s3; extra == "runai"
Requires-Dist: boto3; extra == "runai"
Provides-Extra: audio
Requires-Dist: librosa; extra == "audio"
Requires-Dist: soundfile; extra == "audio"
Provides-Extra: video
Requires-Dist: decord; extra == "video"
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: project-url
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

## Installation

Install the inference engine:

```bash
conda create -n engine python=3.12
conda activate engine
export VLLM_COMMIT=635b897246da121238454ed4b2bbc87cb4d4166b
export VLLM_PRECOMPILED_WHEEL_LOCATION=https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
pip install --editable .
```
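When switching to a different precompiled build, only the commit hash in the wheel URL changes. A small helper (the function name is illustrative, not part of vLLM) that derives the URL using the same pattern as the export commands above:

```python
def precompiled_wheel_url(commit: str) -> str:
    """Build the precompiled-wheel URL for a given vLLM commit.

    Illustrative only: this mirrors the VLLM_PRECOMPILED_WHEEL_LOCATION
    pattern shown in the install commands above.
    """
    return (
        f"https://wheels.vllm.ai/{commit}/"
        "vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"
    )


print(precompiled_wheel_url("635b897246da121238454ed4b2bbc87cb4d4166b"))
```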

Install the sequence predictor:

```bash
cd seq-predictor
conda create -n predictor python=3.10
conda activate predictor
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
pip install -e .
```

## Download datasets and models

Download the datasets:

```bash
cd datasets
huggingface-cli download --repo-type dataset Brookseeworld/Scropio-dataset --local-dir . --resume-download
```

Download the models:

```bash
cd predictor/seq_predictor/MODELS
huggingface-cli download --resume-download Brookseeworld/Scropio-seq-len-predictor --local-dir .
```

## Pipeline

Note that all of the following scripts need to be checked and supplied with the appropriate parameters (e.g., file paths) before running.


1. First, run the sequence length predictor:

```bash
# run seq predictor server
conda activate predictor
python benchmarks/script/entry_predict.py --dataset sharegpt --model 8b
```

2. Then run the engine:

```bash
# run engine
conda activate engine
python benchmarks/script/entry_serving.py --config benchmarks/config/llama8b-sharegpt/minitest.json
```
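The two entry scripts above take their inputs as command-line flags (`--dataset`/`--model` for the predictor, `--config` for the engine). As a hypothetical sketch of how such flags are typically parsed — the authoritative definitions live in `benchmarks/script/` and should be checked there:

```python
import argparse

# Illustrative only: mirrors the flags shown in the pipeline steps above.
# The real flag definitions are in benchmarks/script/*.py.
parser = argparse.ArgumentParser(description="pipeline entry flags (sketch)")
parser.add_argument("--dataset", help="dataset name, e.g. sharegpt")
parser.add_argument("--model", help="model size tag, e.g. 8b")
parser.add_argument("--config", help="path to a JSON serving config")

args = parser.parse_args(["--dataset", "sharegpt", "--model", "8b"])
print(args.dataset, args.model)
```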
