# Multi-Agent Knowledge Graph Pipeline

A multi-agent pipeline, built with LangGraph, that extracts knowledge graphs from academic papers, evaluates their quality, and generates QA pairs from them.

## 🏗️ Architecture

The pipeline consists of three main agents:

1. **Extractor Agent** (`SectionBasedExtractor`)
   - Extracts knowledge graphs from academic papers
   - Processes papers section by section
   - Outputs TTL/RDF format

2. **Evaluator Agent** (`TTLEvaluator`)
   - Evaluates knowledge graph quality
   - Provides scores across multiple dimensions
   - Generates improvement suggestions

3. **QA Generator Agent** (`MultiHopQAGenerator`)
   - Generates multi-hop reasoning questions
   - Creates multiple-choice QA pairs
   - Runs only when the Evaluator Agent rates the knowledge graph as high quality
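
The control flow between the three agents can be sketched in plain Python. This is a minimal illustration, not the LangGraph wiring in `workflow.py`: `PipelineState`, `run_pipeline`, the callable arguments, and the `QUALITY_THRESHOLD` value are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical cutoff; the real pipeline's quality criterion may differ.
QUALITY_THRESHOLD = 0.7


@dataclass
class PipelineState:
    """Data handed from one agent to the next (illustrative fields only)."""
    paper: dict
    ttl: Optional[str] = None
    score: Optional[float] = None
    qa_pairs: List[dict] = field(default_factory=list)


def run_pipeline(
    paper: dict,
    extract: Callable[[dict], str],
    evaluate: Callable[[str], float],
    generate_qa: Callable[[str], List[dict]],
    threshold: float = QUALITY_THRESHOLD,
) -> PipelineState:
    """Run extract -> evaluate -> (optionally) generate_qa on one paper."""
    state = PipelineState(paper=paper)
    state.ttl = extract(paper)         # Extractor Agent: paper -> TTL text
    state.score = evaluate(state.ttl)  # Evaluator Agent: TTL -> quality score
    if state.score >= threshold:       # QA generation is gated on quality
        state.qa_pairs = generate_qa(state.ttl)
    return state
```

Passing the agents in as callables keeps the gating logic testable with stubs, for example `run_pipeline(paper, lambda p: "ttl", lambda t: 0.9, lambda t: [{"q": "..."}])`.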



## 📋 Prerequisites

### Python Dependencies
```bash
pip install -r requirements.txt
```

## 🚀 Quick Start

1. **Set Up Environment**
   ```bash
   cd kg_pipeline
   export OPENAI_API_KEY="your-api-key"
   ```

2. **Prepare Input Data**
   - Place your paper JSON files in the `data/` directory
   - Each file should contain paper content in JSON format

3. **Run Pipeline**
   ```bash
   python run.py
   ```
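
To check the input prepared in step 2 before running the pipeline, a small loader like the following may help. It is a sketch that assumes only that `data/` holds `*.json` files; `load_papers` is a hypothetical helper, not a function from this repository, and it makes no assumption about the fields inside each file.

```python
import json
from pathlib import Path
from typing import Iterator, Tuple


def load_papers(data_dir: str = "data") -> Iterator[Tuple[Path, object]]:
    """Yield (path, parsed JSON) for every *.json file in data_dir, sorted by name."""
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            # json.load raises json.JSONDecodeError on malformed input,
            # which surfaces bad files before the pipeline starts.
            yield path, json.load(fh)


if __name__ == "__main__":
    for path, paper in load_papers():
        print(f"{path.name}: parsed OK ({type(paper).__name__})")
```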

## ⚙️ Configuration

The pipeline is configured via `config.yaml`. Values written as `${VAR}` are placeholders for environment variables.

### Key Configuration Sections

#### API Configuration
```yaml
api:
  openai_api_key: ${OPENAI_API_KEY}
  openai_base_url: ${OPENAI_BASE_URL}
```
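
The `${...}` placeholders suggest that environment variables are substituted when the config is loaded. How the pipeline does this is not shown here; the following is one possible sketch, where `expand_env` is a hypothetical helper that works on the raw config text before it is parsed as YAML.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME}, where NAME is a typical environment variable identifier.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME} in text with env[NAME]; raise KeyError if any are unset."""
    env = os.environ if env is None else env
    missing = []

    def _substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            missing.append(name)
            return match.group(0)  # leave the placeholder untouched for now
        return env[name]

    expanded = _PLACEHOLDER.sub(_substitute, text)
    if missing:
        raise KeyError(f"unset environment variables: {', '.join(missing)}")
    return expanded
```

Failing fast on unset variables is a design choice for this sketch. If a value such as `OPENAI_BASE_URL` is optional in your setup, a lenient variant could substitute an empty string instead.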

## 📁 Directory Structure

```
kg_pipeline/
├── agents/
│   ├── extractor/           # Knowledge extraction agent
│   ├── evaluator/           # Knowledge evaluation agent
│   └── QAgenerator/         # QA generation agent
├── data/                    # Input paper files
├── utils/
│   ├── helpers.py          # Utility functions
│   └── storage.py          # State management
├── outputs/                 # Generated outputs
│   ├── section_based_extractions/  # TTL files
│   ├── evaluations/        # Evaluation results
│   ├── multi_hop_qa/       # QA pairs
│   ├── results/            # Batch reports
│   ├── state/              # Workflow state
│   └── logs/               # Log files
├── config.yaml             # Configuration
├── workflow.py             # Main workflow logic
└── run.py                  # Entry point script
```
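
If you need the `outputs/` tree to exist before a run (the pipeline may already create it), the layout above can be reproduced with a short helper. This is a sketch: `ensure_output_dirs` is a hypothetical name, and only the subdirectory names come from the tree shown.

```python
from pathlib import Path
from typing import Dict

# Subdirectory names taken from the outputs/ tree above.
OUTPUT_SUBDIRS = (
    "section_based_extractions",
    "evaluations",
    "multi_hop_qa",
    "results",
    "state",
    "logs",
)


def ensure_output_dirs(root: str = "outputs") -> Dict[str, Path]:
    """Create each output subdirectory under root if missing; return name -> path."""
    paths = {}
    for name in OUTPUT_SUBDIRS:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
        paths[name] = path
    return paths
```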