# DEMO Video

The demo video shows R2R running in real time. R2R runs on the right side, with R1-32B on the left. Tokens shown in red are routed to the LLM, while tokens in normal text are generated by the SLM, illustrating our efficient token-level routing method. The demo was recorded on two NVIDIA A800-80GB GPUs.
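To make the red/normal coloring concrete, here is a minimal sketch of what a token-level routing loop could look like. It is not the R2R implementation: `route_tokens`, `render`, and the toy `toy_slm`, `toy_llm`, and `toy_router` stand-ins are hypothetical names used only for illustration, and the sketch assumes the router decides per token whether the SLM's draft is kept or replaced by the LLM's token.

```python
from typing import Callable, List, Tuple

RED = "\033[31m"   # ANSI escape code for red text
RESET = "\033[0m"  # ANSI escape code to reset the color


def route_tokens(
    prompt: List[str],
    slm_step: Callable[[List[str]], str],
    llm_step: Callable[[List[str]], str],
    needs_llm: Callable[[List[str], str], bool],
    max_new_tokens: int = 8,
) -> List[Tuple[str, bool]]:
    """Generate tokens one at a time and return (token, routed_to_llm) pairs.

    Assumption: the SLM always drafts a token first, and a router then decides
    whether that draft is kept or the LLM is asked for the token instead.
    """
    context = list(prompt)
    output: List[Tuple[str, bool]] = []
    for _ in range(max_new_tokens):
        draft = slm_step(context)
        routed = needs_llm(context, draft)
        token = llm_step(context) if routed else draft
        context.append(token)
        output.append((token, routed))
    return output


def render(tokens: List[Tuple[str, bool]]) -> str:
    """Show LLM-routed tokens in red and SLM tokens in normal text."""
    return " ".join(f"{RED}{t}{RESET}" if routed else t for t, routed in tokens)


# Toy stand-ins (hypothetical): real models and a trained router go here.
def toy_slm(context: List[str]) -> str:
    return f"s{len(context)}"


def toy_llm(context: List[str]) -> str:
    return f"L{len(context)}"


def toy_router(context: List[str], draft: str) -> bool:
    # Route every third position to the LLM, purely for illustration.
    return len(context) % 3 == 0


demo = route_tokens(["q"], toy_slm, toy_llm, toy_router, max_new_tokens=4)
print(render(demo))
```

In this toy run only the third generated token is routed to the larger model, so it is the only one printed in red.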

You can reproduce the demo by running the following commands with the attached code.

For R2R:

```
python script/playground/interactive_chat.py --router_path resource/default_router.pt
```

For R1-32B:

```
python script/playground/interactive_chat.py
```

When prompted, enter a question, such as:

```
Find the largest possible real part of \[(75+117i)z + \frac{96+144i}{z}\] where $z$ is a complex number with $|z|=4$.
```
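For checking the models' final answers, the example question above can be worked by hand. The derivation below is a reference sketch, not output from either model; it writes $z = 4e^{i\theta}$, where $\theta$ is a symbol introduced here.

```latex
\begin{aligned}
(75+117i)z + \frac{96+144i}{z}
  &= (300+468i)e^{i\theta} + (24+36i)e^{-i\theta}, \qquad z = 4e^{i\theta},\\
\operatorname{Re}\!\left[\,\cdot\,\right]
  &= 300\cos\theta - 468\sin\theta + 24\cos\theta + 36\sin\theta
   = 324\cos\theta - 432\sin\theta,\\
\max_{\theta}\,\bigl(324\cos\theta - 432\sin\theta\bigr)
  &= \sqrt{324^2 + 432^2} = 108\sqrt{3^2 + 4^2} = 540.
\end{aligned}
```

So a correct response to this prompt should end with 540.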

