# Implementation Guide: GATv2-NS3 Hybrid IDS

## Mission Statement
Implement a hybrid Graph Neural Network + Network Simulation framework for enterprise intrusion detection. Your goal is to create a working system that combines GATv2 with NS-3 simulation feedback for improved attack detection and explainable AI.

## Research Context
This guide implements research on using network simulation to enhance GNN training for cybersecurity. The core innovation is a feedback loop in which NS-3 simulations validate and improve GNN attention mechanisms, making intrusion detection more robust and interpretable.

**📖 For complete research background, see [`research_summary.md`](./research_summary.md)** which provides:
- Detailed problem statement and technical approach
- Comprehensive analysis of both research proposals
- Dataset descriptions and evaluation methodology  
- Links to all research documents in this directory
- Difficulty assessment and implementation strategy

### Research Documents Directory
Before starting implementation, familiarize yourself with these research documents:

| Document | Purpose | Key Information |
|----------|---------|-----------------|
| **[`research_summary.md`](./research_summary.md)** | 📋 **START HERE** - Complete overview | Problem statement, proposals comparison, difficulty assessment |
| **[`gatv2_ns3_hybrid_research.md`](./gatv2_ns3_hybrid_research.md)** | 📚 Technical dossier | Citations, datasets, architectures, implementation details |
| **[`proposal_1.md`](./proposal_1.md)** | 🎯 "Network Chaos as Teacher" | Procedural perturbations, robustness training approach |
| **[`proposal_2.md`](./proposal_2.md)** | ⭐ "Self-Focusing Simulations" | **RECOMMENDED** - Adaptive simulation fidelity |
| **[`system_architecture.md`](./system_architecture.md)** | 🏗️ System design | Module interfaces, data flow, training protocols |

**Reading Order**: Start with `research_summary.md`, then focus on `proposal_2.md` and `system_architecture.md` for implementation details.

## Implementation Priority: Proposal 2 (Self-Focusing Simulations)
Focus on **Proposal 2** as it's more technically feasible while still novel:
- Implement dynamic simulation fidelity based on model uncertainty
- Use attention entropy to trigger detailed NS-3 analysis
- Create efficient resource allocation for targeted forensic investigation

## Core System Architecture

### 1. Data Pipeline (Priority: HIGH)
```python
# Target interface (method bodies are left to the implementer)
import torch
from torch_geometric.data import Data


class CiscoDataset(torch.utils.data.Dataset):
    def __init__(self, root, split_file, window_size_s=300, step_s=60):
        ...

    def __getitem__(self, idx) -> Data:
        # Returns: x [N, F], edge_index [2, E], edge_attr [E, Fe],
        #          y_node [N], graph_id, window_idx
        ...
```

**Implementation Steps:**
1. Download Cisco dataset from UCI ML Repository (CC BY 4.0 license)
2. Implement temporal windowing (300s windows, 60s stride)
3. Extract node features: degree, clustering coefficient, protocol distribution, bytes in/out
4. Extract edge features: bytes, packets, inter-arrival stats, duration, flow count
5. Create train/val/test splits by enterprise (14/4/4 graphs)
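
Steps 2 and 5 can be sketched with the standard library alone. The flow record layout `(timestamp_s, src, dst, n_bytes)` and the helper names below are assumptions for illustration; the actual dataset schema should be checked before adapting this.

```python
import random
from typing import Iterator, List, Sequence, Tuple

# Hypothetical flow record: (timestamp_s, src, dst, n_bytes)
Flow = Tuple[float, str, str, int]


def sliding_windows(
    t_start: float, t_end: float, window_size_s: float = 300.0, step_s: float = 60.0
) -> Iterator[Tuple[float, float]]:
    """Yield half-open [lo, hi) windows that fit entirely inside [t_start, t_end]."""
    if window_size_s <= 0 or step_s <= 0:
        raise ValueError("window_size_s and step_s must be positive")
    i = 0
    while t_start + i * step_s + window_size_s <= t_end:
        lo = t_start + i * step_s
        yield (lo, lo + window_size_s)
        i += 1


def flows_in_window(flows: Sequence[Flow], lo: float, hi: float) -> List[Flow]:
    """Keep the flows whose timestamp falls in [lo, hi)."""
    return [f for f in flows if lo <= f[0] < hi]


def split_graph_ids(
    graph_ids: Sequence[str], n_train: int = 14, n_val: int = 4, n_test: int = 4, seed: int = 0
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle whole graphs deterministically and split them 14/4/4 by default."""
    ids = sorted(graph_ids)
    if len(ids) != n_train + n_val + n_test:
        raise ValueError("split sizes must add up to the number of graphs")
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


if __name__ == "__main__":
    print(list(sliding_windows(0, 600)))
    print(split_graph_ids([f"g{i}" for i in range(22)]))
```

Splitting on whole graphs rather than on windows keeps every window of a given enterprise in a single split, which avoids leakage between training and evaluation.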

### 2. GATv2 Backbone (Priority: HIGH)
```python
# Target interface (method bodies are left to the implementer)
from torch import nn

class GATv2IDS(nn.Module):
    def __init__(self, in_dim_node, in_dim_edge, hidden=128, layers=3, heads=4):
        super().__init__()

    def forward(self, data) -> dict:
        # Returns: node_logits [N, 2], edge_attn [E], node_emb [N, H]
        ...
```

**Implementation Steps:**
1. Use PyTorch Geometric GATv2Conv with edge attributes
2. Add attention explanation heads (linear + sigmoid over attention coefficients)
3. Implement attention entropy monitoring for uncertainty detection
4. Create sparsity regularization for interpretable attention masks
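
Step 3 can be illustrated without any framework. In PyTorch Geometric, `GATv2Conv` can return per-edge attention when called with `return_attention_weights=True`; the sketch below assumes those values have already been averaged over heads and converted to plain Python lists. The function name and the choice to normalize by `log(k)` are illustrative assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def normalized_attention_entropy(
    edge_index: Sequence[Tuple[int, int]], edge_attn: Sequence[float]
) -> Dict[int, float]:
    """Entropy of the attention distribution over each node's incoming edges.

    Returns a value in [0, 1] per destination node: 0 means attention is
    concentrated on one neighbor, 1 means it is spread uniformly.
    Nodes with fewer than two incoming edges get 0 by convention.
    """
    incoming: Dict[int, List[float]] = defaultdict(list)
    for (_, dst), a in zip(edge_index, edge_attn):
        incoming[dst].append(a)

    entropy: Dict[int, float] = {}
    for node, weights in incoming.items():
        total = sum(weights)
        if len(weights) < 2 or total <= 0:
            entropy[node] = 0.0
            continue
        probs = [w / total for w in weights]  # renormalize defensively
        h = -sum(p * math.log(p) for p in probs if p > 0)
        entropy[node] = max(0.0, h) / math.log(len(weights))
    return entropy


if __name__ == "__main__":
    edges = [(0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]
    attn = [0.5, 0.5, 1.0, 0.0, 0.0]
    print(normalized_attention_entropy(edges, attn))
```

Dividing by `log(k)` makes nodes with different in-degrees comparable, so a single threshold can be used to flag uncertain nodes.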

### 3. NS-3 Integration Layer (Priority: MEDIUM)
```python
# Target interface (quoted return annotations are placeholder types, not yet defined)
class NS3Client:
    def run_scenario(self, scenario_spec) -> "sim_report":
        ...

class SimCache:
    def lookup(self, graph_id, subgraph_id, scenario_id) -> "sim_report":
        ...

    def store(self, graph_id, subgraph_id, scenario_id, sim_report):
        ...
```

**Implementation Strategy - Choose ONE:**

**Option A (Recommended): Offline Cache + Surrogate**
1. Pre-generate ~2k simulation scenarios per enterprise graph
2. Train MLP surrogate to predict KPIs from graph features
3. Use surrogate during training, validate with real NS-3 periodically

**Option B (Advanced): In-the-loop Micro-sim**
1. Use ns3-gym or ns3-ai for real-time simulation
2. Limit to subgraphs around top-K suspicious edges (K=50)
3. Cache results by scenario hash
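
Caching by scenario hash (Option B, step 3) could look roughly like the following. This sketch keys the cache on a hash of the scenario spec rather than on the `(graph_id, subgraph_id, scenario_id)` triple of the target interface; the class name `InMemorySimCache` and the dict-shaped spec and report are assumptions for illustration.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def scenario_hash(scenario_spec: Dict[str, Any]) -> str:
    """Stable hash of a JSON-serializable scenario spec.

    Keys are sorted so that two specs with the same content but different
    key order map to the same cache entry.
    """
    canonical = json.dumps(scenario_spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class InMemorySimCache:
    """Minimal cache: scenario hash -> simulation report (a plain dict here)."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, Any]] = {}

    def lookup(self, scenario_spec: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        return self._store.get(scenario_hash(scenario_spec))

    def store(self, scenario_spec: Dict[str, Any], sim_report: Dict[str, Any]) -> None:
        self._store[scenario_hash(scenario_spec)] = sim_report


if __name__ == "__main__":
    cache = InMemorySimCache()
    spec = {"graph_id": "g3", "failed_link": [4, 7], "delay_ms": 20}
    if cache.lookup(spec) is None:
        # A real system would call the NS-3 client here; this report is made up.
        cache.store(spec, {"latency_ms": 31.2, "drop_rate": 0.01})
    print(cache.lookup(spec))
```

Hashing a canonical serialization means the cache hits whenever the same counterfactual is proposed again, regardless of how the spec dictionary was built.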

### 4. Simulation Feedback System (Priority: MEDIUM)
```python
# Target interface (quoted return annotations are placeholder types, not yet defined)
class SimFeedback:
    def propose_scenarios(self, data, edge_attn, num_cf=3) -> "scenario_specs":
        ...

    def eval_scenarios(self, scenario_specs) -> "kpi_deltas":
        ...

    def compute_losses(self, edge_attn, kpi_deltas) -> dict:
        ...
```

**Key Components:**
1. **Scenario Generator**: Create counterfactuals (link failures, delay changes, QoS toggles)
2. **KPI Extraction**: Measure latency, throughput, drop rates, jitter
3. **Fidelity Loss**: Align attention weights with KPI sensitivity using Spearman correlation

### 5. Multi-Objective Training (Priority: HIGH)
Implement the combined loss function:
```
L = L_cls + λ_fid * L_fid + λ_sim * L_sim + λ_sp * L_sp
```

Where:
- `L_cls`: Weighted cross-entropy for attack detection
- `L_fid`: Explanation fidelity vs simulation counterfactuals  
- `L_sim`: KPI reconstruction consistency
- `L_sp`: Sparsity regularization for attention masks

**Training Schedule:**
1. Epochs 1-5: Warm-up with only L_cls
2. Epochs 6-10: Gradually introduce L_fid and L_sim
3. Epochs 11+: Full multi-objective training with adaptive chaos
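
One way to express this schedule is a function that returns the λ weights for a given epoch. The linear ramp shape, the decision to ramp `λ_sp` together with the other two terms, and the default weights of 1.0 are assumptions; the schedule above only fixes the epoch boundaries.

```python
from typing import Dict


def loss_weights(
    epoch: int,
    lam_fid: float = 1.0,
    lam_sim: float = 1.0,
    lam_sp: float = 1.0,
    warmup_end: int = 5,
    ramp_end: int = 10,
) -> Dict[str, float]:
    """Weights for the auxiliary losses at a 1-indexed epoch.

    Epochs 1..warmup_end: all auxiliary weights are 0 (L_cls only).
    Epochs warmup_end+1..ramp_end: weights grow linearly.
    Later epochs: full weights.
    """
    if epoch < 1:
        raise ValueError("epoch is 1-indexed")
    if epoch <= warmup_end:
        ramp = 0.0
    elif epoch <= ramp_end:
        ramp = (epoch - warmup_end) / (ramp_end - warmup_end + 1)
    else:
        ramp = 1.0
    return {"fid": lam_fid * ramp, "sim": lam_sim * ramp, "sp": lam_sp * ramp}


def total_loss(epoch: int, l_cls: float, l_fid: float, l_sim: float, l_sp: float) -> float:
    """L = L_cls + λ_fid * L_fid + λ_sim * L_sim + λ_sp * L_sp, with scheduled λ."""
    w = loss_weights(epoch)
    return l_cls + w["fid"] * l_fid + w["sim"] * l_sim + w["sp"] * l_sp


if __name__ == "__main__":
    for epoch in (1, 6, 10, 11):
        print(epoch, loss_weights(epoch))
```

The same arithmetic works unchanged when the loss terms are tensors instead of floats, since only addition and scalar multiplication are used.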

## Implementation Phases

### Phase 1: Foundation
- [ ] Set up PyTorch Geometric environment
- [ ] Download and preprocess Cisco dataset
- [ ] Implement basic GATv2 model with attention explanations
- [ ] Create train/val/test data loaders
- [ ] Implement baseline training loop with L_cls only

### Phase 2: Simulation Integration
- [ ] Set up NS-3 environment with ns3-gym
- [ ] Implement scenario generation for counterfactuals
- [ ] Create simulation cache and surrogate model training
- [ ] Add KPI extraction and fidelity loss computation
- [ ] Test simulation feedback loop on small subgraphs

### Phase 3: Full System
- [ ] Implement multi-objective loss with all components
- [ ] Add attention entropy monitoring and adaptive simulation
- [ ] Create visualization tools for attention overlays
- [ ] Implement scalability optimizations (<100ms explanations)
- [ ] Add adversarial robustness evaluation

### Phase 4: Evaluation
- [ ] Implement baseline comparisons (GraphSAGE, GIN, RF, XGBoost)
- [ ] Run ablation studies (remove simulation feedback)
- [ ] Measure scalability (100-10k nodes) and latency
- [ ] Generate case studies with attention visualizations
- [ ] Document results and create reproducible artifacts

## Technical Requirements

### Dependencies
```bash
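# Core ML stack
# torch-scatter and torch-sparse are optional in recent PyG releases and
# need wheels that match the installed torch/CUDA build
pip install torch torchvision torch-geometric torch-scatter torch-sparse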
pip install scikit-learn xgboost pandas numpy matplotlib seaborn

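# NS-3 integration
# ns3-gym and ns3-ai are normally built from source alongside NS-3;
# confirm the Python package name in the project README before running this
pip install ns3-gym  # or build ns3-ai from source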
pip install networkx plotly

# Utilities
pip install tqdm wandb hydra-core omegaconf
```

### Hardware Requirements
- **Minimum**: 16GB RAM, CUDA-capable GPU (8GB+ VRAM)
- **Recommended**: 32GB RAM, A100/V100 GPU, SSD storage
- **NS-3**: Linux environment preferred (Ubuntu 20.04+)

### Performance Targets
- **Accuracy**: >90% recall on attack detection
- **Speed**: <100ms for explanation generation
- **Scalability**: Handle graphs with 10k+ nodes
- **Robustness**: <15% performance gap vs black-box baselines
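
The speed target can be checked with a small timing helper. This is a sketch under assumptions: the helper name, the warm-up and repeat counts, and the reported percentiles are illustrative. For GPU code the device would need to be synchronized (for example with `torch.cuda.synchronize()`) before each clock read, which this CPU-only example omits.

```python
import math
import statistics
import time
from typing import Any, Callable, Dict


def measure_latency_ms(
    fn: Callable[..., Any], *args: Any, warmup: int = 3, repeats: int = 20, **kwargs: Any
) -> Dict[str, float]:
    """Time `fn(*args, **kwargs)` and report median, p95, and max in milliseconds."""
    for _ in range(warmup):  # discard cold-start effects (caches, lazy init)
        fn(*args, **kwargs)
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args, **kwargs)
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    p95_index = min(len(samples) - 1, math.ceil(0.95 * len(samples)) - 1)
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


if __name__ == "__main__":
    # Stand-in for an explanation call; replace with the real explainer.
    stats = measure_latency_ms(lambda: sum(i * i for i in range(10_000)))
    print(stats, "meets target:", stats["p95_ms"] < 100.0)
```

Reporting a high percentile alongside the median makes it harder for occasional slow calls to hide behind a good average.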

## Key Implementation Notes

### Critical Success Factors
1. **Start Simple**: Begin with offline simulation cache, not real-time integration
2. **Validate Early**: Test attention-KPI correlation on small examples first  
3. **Modular Design**: Keep GNN, simulation, and feedback components decoupled
4. **Performance First**: Optimize for speed before adding complexity

### Common Pitfalls to Avoid
1. **Simulation Complexity**: Don't over-engineer NS-3 scenarios initially
2. **Loss Balancing**: Tune λ weights carefully - start with equal weighting
3. **Memory Issues**: Use sparse tensors and gradient accumulation for large graphs
4. **Reproducibility**: Fix all random seeds and log hyperparameters
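
For pitfall 4, a single seeding helper called at the start of every script is a common pattern. The helper name `seed_everything` is an assumption; NumPy and PyTorch are seeded only if they are installed, so the sketch also runs in a bare Python environment.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed the random number generators this project is likely to touch."""
    random.seed(seed)
    # Only affects Python processes started after this point (e.g. workers).
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch

        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


if __name__ == "__main__":
    seed_everything(42)
    first = random.random()
    seed_everything(42)
    print(first == random.random())
```

Seeding alone does not guarantee bit-identical GPU results, so logging the seed together with the hyperparameters remains necessary.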

### Debugging Strategies
1. **Synthetic Data**: Test on simple synthetic graphs with known ground truth
2. **Ablation Testing**: Remove components systematically to isolate issues
3. **Visualization**: Plot attention weights, KPI correlations, loss curves
4. **Logging**: Use structured JSON logging for all metrics and hyperparameters
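
Strategy 4 can be implemented with the standard `logging` module and a custom formatter. The formatter class, the `fields` key passed through `extra`, and the metric names are assumptions for illustration.

```python
import json
import logging
from typing import IO, Optional


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Structured values are passed via `extra={"fields": {...}}`.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_json_logger(name: str, stream: Optional[IO[str]] = None) -> logging.Logger:
    """Create (or reuse) a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output through the root logger
    if not logger.handlers:
        handler = logging.StreamHandler(stream)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = make_json_logger("train")
    log.info("hyperparameters", extra={"fields": {"hidden": 128, "heads": 4, "layers": 3}})
    log.info("epoch_end", extra={"fields": {"epoch": 3, "loss_cls": 0.41}})
```

One JSON object per line keeps the logs easy to load into pandas or to filter with command-line tools when comparing runs.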

## Expected Deliverables

### Code Structure
```
gatv2_ns3_ids/
├── data/
│   ├── cisco_dataset.py          # Data loading and preprocessing
│   └── synthetic_labels.py       # Malicious node injection
├── models/
│   ├── gatv2_ids.py              # Main GNN model
│   ├── explainers.py             # Attention-based explanations
│   └── baselines.py              # GraphSAGE, GIN, RF, XGBoost
├── simulation/
│   ├── ns3_client.py             # NS-3 integration layer
│   ├── scenario_gen.py           # Counterfactual generation
│   ├── sim_cache.py              # Caching and surrogate models
│   └── feedback.py               # Simulation feedback system
├── training/
│   ├── multi_objective.py        # Loss functions and training
│   ├── curriculum.py             # Training schedules
│   └── evaluation.py             # Metrics and benchmarking
├── visualization/
│   ├── attention_viz.py          # Attention overlay plots
│   └── case_studies.py           # What-if analysis
├── experiments/
│   ├── configs/                  # Hydra configuration files
│   ├── scripts/                  # Training and evaluation scripts
│   └── notebooks/                # Analysis and visualization
└── tests/
    ├── test_data.py              # Unit tests for data pipeline
    ├── test_models.py            # Unit tests for models
    └── test_simulation.py        # Unit tests for NS-3 integration
```

### Documentation
- [ ] API documentation with docstrings
- [ ] Installation and setup guide  
- [ ] Training and evaluation tutorials
- [ ] Case study examples with visualizations
- [ ] Performance benchmarking results

### Experimental Artifacts
- [ ] Trained model checkpoints
- [ ] Evaluation metrics and plots
- [ ] Attention visualization examples
- [ ] Simulation scenario cache
- [ ] Reproducibility scripts with fixed seeds

## Success Metrics

### Technical Validation
- [ ] Model achieves >90% recall on Cisco dataset
- [ ] Explanation generation completes in <100ms
- [ ] System scales to 10k+ node graphs
- [ ] Ablation shows simulation feedback improves performance
- [ ] Attention weights correlate with simulation-derived importance

### Research Contribution
- [ ] Novel integration of GNN + network simulation
- [ ] Demonstrable improvement in explainability fidelity
- [ ] Efficient adaptive simulation resource allocation
- [ ] Comprehensive evaluation against strong baselines
- [ ] Reproducible implementation with clear documentation