## Thinking is Seeing: Multi-modal Large Language Models are Exceptional in Understanding Knowledge Graphs

This is the official repository for SeeKG. The project investigates how well multi-modal large language models (MLLMs) can understand and reason over knowledge graph (KG) images, an area that has received limited exploration.

## Quick Start

### Install Dependencies
We provide the environment configuration in `env.yaml`. First, create a conda environment for SeeKG:
```shell
conda env create -f env.yaml
```

### Download the datasets
We use the textualized datasets provided by [KoPA](https://github.com/zjukg/KoPA). Download [data.zip](https://drive.google.com/file/d/1J1Ioi23jTMaBkBDYzfIy2MAZYMUIjFWW/view), extract it, and place all JSON files in the `./data` directory.
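Before moving on, it can help to confirm that the JSON files actually landed in `./data`. The snippet below is a minimal sketch, not part of the repository; it assumes the files are expected directly under `./data` rather than in a nested subfolder, and the variable names are illustrative.

```shell
# Count the JSON files placed directly under ./data.
# Assumption: files belong at the top level of ./data, not in a subfolder.
data_dir="./data"

if [ -d "$data_dir" ]; then
    # tr strips the padding that some wc implementations add.
    json_count=$(find "$data_dir" -maxdepth 1 -type f -name '*.json' | wc -l | tr -d ' ')
else
    json_count=0
fi

echo "JSON files in ${data_dir}: ${json_count}"
```

A count of 0 usually means the archive was extracted somewhere else or into a nested folder.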

### Construct SeeKG dataset on CoDeX-S
Run the following script to generate KG images and the multi-modal CoDeX-S dataset:
```shell
python construct_seeKG_dataset.py --dataset_name CoDeX-S --max_samples 10000 --max_path_length 3 --max_node_num 6
```

### Train SeeKG on CoDeX-S dataset
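Run the following command to train SeeKG on the multi-modal CoDeX-S dataset. Training uses the `llamafactory-cli` command from LLaMA-Factory, and `CUDA_VISIBLE_DEVICES=1` restricts the run to the GPU with index 1: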
```shell
CUDA_VISIBLE_DEVICES=1 llamafactory-cli train configs/qwen25vl-codex.yaml
```
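The commands in this README pin the job to GPU index 1 through `CUDA_VISIBLE_DEVICES`. If your machine has a different GPU layout, a quick check like the following sketch may help you pick an index. It assumes an NVIDIA setup where `nvidia-smi` may or may not be installed; `GPU_ID` is a hypothetical helper variable, not something the repository reads.

```shell
# GPU index to use; 1 matches the commands in this README.
GPU_ID=1

# List the available GPUs and their memory usage, if nvidia-smi is present.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv \
        || echo "nvidia-smi could not query the GPUs"
else
    echo "nvidia-smi not found; cannot list GPUs"
fi

echo "Selected GPU index: ${GPU_ID}"
```

You can then substitute the chosen index into the commands, for example `CUDA_VISIBLE_DEVICES=0` on a single-GPU machine.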

### Evaluate SeeKG with the original MLLM on CoDeX-S dataset
Run the following script to evaluate the original MLLM (without a LoRA adapter, hence `--adapter_name_or_path None`) on the multi-modal CoDeX-S dataset:
```shell
CUDA_VISIBLE_DEVICES=1 python infer_seeKG.py --dataset codex-test --model_name_or_path ../Qwen/Qwen2.5-VL-7B-Instruct --adapter_name_or_path None
```

### Evaluate SeeKG with LoRA Adapter
Run the following script to evaluate SeeKG with the trained LoRA adapter. Replace `checkpoint-220` with the checkpoint produced by your own training run:
```shell
CUDA_VISIBLE_DEVICES=1 python infer_seeKG.py --dataset codex-test --model_name_or_path ../Qwen/Qwen2.5-VL-7B-Instruct --adapter_name_or_path saves/qwen2_5vl-7b/lora/sft/checkpoint-220
```
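The checkpoint number depends on your training run, so `checkpoint-220` may not exist on your machine. The sketch below looks up the checkpoint with the highest step number. It assumes checkpoints are saved as `checkpoint-<step>` directories under `saves/qwen2_5vl-7b/lora/sft`, as the path above suggests; the variable names are illustrative.

```shell
# Directory that holds the LoRA checkpoints (taken from the command above).
ckpt_root="saves/qwen2_5vl-7b/lora/sft"

latest=""
latest_step=-1

for dir in "$ckpt_root"/checkpoint-*; do
    # Skip the unexpanded glob and anything that is not a directory.
    [ -d "$dir" ] || continue

    # Extract the step number that follows "checkpoint-".
    step="${dir##*checkpoint-}"
    case "$step" in
        ''|*[!0-9]*) continue ;;  # ignore names without a purely numeric step
    esac

    if [ "$step" -gt "$latest_step" ]; then
        latest_step="$step"
        latest="$dir"
    fi
done

if [ -n "$latest" ]; then
    echo "Latest checkpoint: $latest"
else
    echo "No checkpoints found under $ckpt_root"
fi
```

The printed path can be passed to `--adapter_name_or_path` in the evaluation command. Note that the latest checkpoint is not necessarily the best one.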