<div align="center">

# MM-EUREKA

</div>


<hr>
<div align="center">
<p style="text-align: center;">MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning<p>
</div>


## 📊MMK12
We release our constructed MMK12 dataset anonymously at [HuggingFace](https://huggingface.co/datasets/MMEureka123/MMK12-Dataset) 

## 🎯Overview

We present **MM-Eureka-Qwen-7B** and **MM-Eureka-Qwen-32B**, both are powerful multimodal reasoning models that successfully extend large-scale rule-based reinforcement learning (RL) to multimodal reasoning. Compared to the previous version of MM-EUREKA based on InternVL, we have made improvements in model architecture, algorithms, and data. For instance, MM-Eureka-Qwen-7B achieves **64.5** on MMK12 evaluation sets, only 0.2 points below InternVL-2.5-78B. On MathVista(testmini), it reaches **73.0**, even surpassing InternVLVL-2.5-78B. MM-Eureka-Qwen-32B demonstrates stronger performance, scoring **72.2** on MMK12 evaluation sets, which exceeds both Qwen2.5-VL-72B's **70.5** and closed-source models like Gemini2-Flash, ranking second only to o1's **73.9**. On commonly used multimodal mathematical reasoning benchmarks, MM-Eureka-Qwen-32B achieves **73.4** on WeMath, outperforming all open-source models and most closed-source models including Claude3.7 Sonnet. On MathVista, it reaches **74.8**, surpassing all open-source and closed-source models. Both variants demonstrate significant improvements in multidisciplinary K12 and mathematical reasoning performance, outperforming most open-source models of similar sizes.



## 🏁 Getting Started

### 📦 Installation

```shell
git clone ...
git checkout qwen
cd MM-EUREKA
pip install -e .[vllm]
pip install flash_attn --no-build-isolation
```



#### Custom dataset

For custom dataset, format your data in to a JSONL file, where each entry is a dictionary organized in the following format.

```json
{
  "id": "0",
  "message": "[{\"role\": \"user\", \"content\": [{\"type\": \"image\", \"image\": \"file:///path/to/your/image.jpg\"}, {\"type\": \"text\", \"text\": \"How many cats in the image?\"}]}]",
  "answer": "gt that could be parsed and verified by math_verify"
}
```

### 🌐 Start Training

Before starting your own training, ensure that the paths in the provided training scripts are correctly set and that environment variables like `$MASTER_ADDR` and `$NODE_RANK` are properly configured.

**start MM-Eureka-Qwen-7B training**

- for single node

  ```shell
  sh examples/scripts/train_mm_eureka_qwen_7b_single_node.sh
  ```

- for multiple node

  ```shell
  sh examples/scripts/train_mm_eureka_qwen_7b_multi_node.sh
  ```



