# JoyAgents-R1

This repository is the official implementation of the following paper:

**JoyAgents-R1: Accelerating Multi-Agent Evolution Dynamics with Variance-Reduction Group Relative Policy Optimization** 


## Requirements

- Python 3.10
```
conda create --no-default-packages -n choi python=3.8
conda activate choi
```

- [PyTorch](https://www.pytorch.org) is tested on version 2.0.0
```
conda install pytorch==2.0.0 torchvision==0.9.0 cudatoolkit=11.1.1 -c pytorch -c conda-forge
```

- Other packages are listed in `requirements.txt`
```
pip install -r requirements.txt
```


## Evaluation
- To evaluate our models, run:
```eval
cd scripts
python -u predict.py
```
- The resulting file is generated in the `./pred_results` directory.
- The baseline evaluation code is put in `predict_baselins.py`, which contains all the baselines in the paper.
- `memory` directory contains the final memory of each agent after RL training.
- `models` directory contains the final models of each agent after RL training.
- Agent names in the code:
    1. `master`: Master agent. 
    2. `math`: Math agent.
    3. `expert`: Question-answering agent.
    4. `rody`: E-commerce funcion-call agent.
    5. `toolbench`: General funcion-call agent. 


## Results
Our method achieves the following performance on different sub-task test sets:

| Method | Math | QA | E-commerce FC | General FC | Cooperation | Average |
| :-----:| :---: | :---: | :---: | :---: | :---: | :---: |  
| JoyAgents-R1 (Ours) | 68.0 | 22.0 |  48.0 |  76.0 |  6.0 |  44.0 |   


## The service interface
Our code involves several internal service interfaces. We have currently removed the URLs and API keys from the public parts, as well as the code for some service calls. For example:
- `llm_qwen32b`
- `trion20_onnx_embedding_clip`
- `vearch_retrieval`/`call_tool_retrieval`
- `deepseek-v3`/`deepseek-r1`/`gpt-4o` calling (you can use your own API key and URL to call these services)


## Note
The data and models are undergoing internal review and are expected to be made public when the paper is published.