# Code for UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning





## Setup

```shell
conda create -n ui-s1 python=3.11
conda activate ui-s1
cd ui-s1
pip install -e .
pip install vllm==0.8.2
pip install flash-attn==2.7.4.post1 --no-build-isolation

```
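
After installation, it can help to confirm that the pinned versions above are the ones actually present in the environment. The following is a minimal sketch, not part of the repository: `EXPECTED` and `check_versions` are hypothetical names, and the pins are copied from the install commands above.

```python
from importlib import metadata

# Version pins copied from the setup commands above.
EXPECTED = {"vllm": "0.8.2", "flash-attn": "2.7.4.post1"}


def check_versions(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    problems = check_versions()
    print("all pinned versions match" if not problems else problems)
```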
We use SwanLab for training visualization. Set your own SwanLab API key and host in `verl/utils/tracking.py`.

## Data

1. Download AndroidControl, placing the images in `datasets/AndroidControl/images` and the example training file at `datasets/android_control_train_example.jsonl`.
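
Before launching training, a quick check that both paths are populated can save a failed run. The sketch below is an illustration under stated assumptions rather than a repository utility: `check_dataset` is a hypothetical helper, it assumes the commands are run from the repository root, and it makes no assumption about the record schema beyond each non-empty line being valid JSON.

```python
import json
from pathlib import Path


def check_dataset(root="datasets", max_records=5):
    """Summarize the dataset layout listed above without assuming a record schema."""
    root = Path(root)
    image_dir = root / "AndroidControl" / "images"
    jsonl_path = root / "android_control_train_example.jsonl"

    summary = {
        "image_dir_exists": image_dir.is_dir(),
        "num_image_files": 0,
        "jsonl_exists": jsonl_path.is_file(),
        "records_parsed": 0,
        "first_record_keys": [],
    }

    if summary["image_dir_exists"]:
        # Count regular files at any depth; the sub-directory layout is not assumed.
        summary["num_image_files"] = sum(1 for p in image_dir.rglob("*") if p.is_file())

    if summary["jsonl_exists"]:
        with jsonl_path.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                record = json.loads(line)  # raises ValueError on a malformed line
                if summary["records_parsed"] == 0 and isinstance(record, dict):
                    summary["first_record_keys"] = sorted(record)
                summary["records_parsed"] += 1
                if summary["records_parsed"] >= max_records:
                    break
    return summary


if __name__ == "__main__":
    print(check_dataset())
```

Only the first `max_records` lines are parsed, so the check stays fast on a large file; the printed keys of the first record give a quick look at the fields the training script will see.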


## Train

```shell
bash scripts/train_example.sh
```

## Inference and evaluation


```shell
# 1. Launch the vLLM server

vllm serve /checkpoints-7B --served-model-name UI-S1-7B --tensor_parallel_size 1 --trust-remote-code --limit-mm-per-prompt image=2

# 2. Evaluate UI-S1-7B's performance on SOP
python /evaluation/eval_qwenvl.py --model_name UI-S1-7B

# 3. Evaluate other models
python /evaluation/eval_qwenvl.py --model_name Qwen2.5-VL-7B
python /evaluation/eval_agentcpm.py --model_name AgentCPM-GUI-8B
python /evaluation/eval_os-atlas-7b.py --model_name OS-Atlas-7B
python /evaluation/eval_os-genesis-7b.py --model_name OS-Genesis-7B
python /evaluation/eval_ui-tars-7b.py --model_name UI-TARS-7B
```
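
Between steps 1 and 2 it can be useful to confirm that the server is up and exposes the expected model name. The sketch below queries the OpenAI-compatible `/v1/models` endpoint that `vllm serve` provides. The base URL `http://localhost:8000` is an assumption (vLLM's default port when `--port` is not given), and `served_model_ids` and `fetch_served_models` are hypothetical helper names; this README does not say which address the evaluation scripts use, so adjust accordingly.

```python
import json
import urllib.request


def served_model_ids(payload):
    """Extract model names from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_served_models(base_url="http://localhost:8000", timeout=5.0):
    """Query the model list; base_url is an assumption (vLLM's default port)."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        return served_model_ids(json.load(resp))


if __name__ == "__main__":
    models = fetch_served_models()
    print("served models:", models)
    if "UI-S1-7B" not in models:
        raise SystemExit("UI-S1-7B is not being served; check --served-model-name")
```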