# DATE-GFN Repository Structure

This document explains the organization and purpose of each component in the DATE-GFN research repository.

## Directory Overview

```
DATE-GFN-Research/
├── 📄 README.md                    # Main documentation
├── 📄 LICENSE                      # MIT license
├── 📄 requirements.txt             # Python dependencies
├── 📄 setup.py                     # Package installation
├── 📄 REPOSITORY_STRUCTURE.md      # This file
│
├── 📂 src/                         # Core source code
│   └── date_gfn/                   # Main package
│       ├── core/                   # Core algorithms
│       ├── baselines/              # Baseline methods
│       ├── environments/           # Test environments
│       ├── metrics/                # Performance metrics
│       └── utils/                  # Utilities
│
├── 📂 experiments/                 # Research experiments
│   ├── rq1_comparative/            # RQ1: Method comparison
│   ├── rq2_robustness/             # RQ2: Lambda ablation
│   ├── rq3_efficiency/             # RQ3: Sample efficiency
│   ├── rq4_exploration/            # RQ4: Exploration control
│   ├── rq5_diversity/              # RQ5: Population diversity
│   ├── rq6_scalability/            # RQ6: Problem scaling
│   └── robustness_analysis/        # Additional robustness tests
│
├── 📂 examples/                    # Usage examples
│   └── hypergrid/                  # Hypergrid examples
│
├── 📂 scripts/                     # Utility scripts
│   ├── setup_environment.sh        # Automated setup
│   ├── run_comparative_analysis.py # Method comparison
│   └── ...                         # Other utilities
│
├── 📂 configs/                     # Configuration files
│   ├── base_config.yaml            # Default settings
│   └── hypergrid_config.yaml       # Hypergrid environment
│
└── 📂 tests/                       # Unit tests
    ├── test_core/                  # Core algorithm tests
    ├── test_environments/          # Environment tests
    └── test_baselines/             # Baseline tests
```

## Core Components

### `src/date_gfn/` - Main Package

**Purpose**: Core implementation of DATE-GFN and supporting components

- **`core/`**: Main algorithms

  - `date_gfn.py`: DATE-GFN implementation
  - `distillation_aware_fitness.py`: Fitness function
  - `evolutionary_algorithm.py`: EA components

- **`baselines/`**: Comparison methods

  - `gfn_baseline.py`: Standard GFlowNet trained with trajectory balance (TB)
  - `egfn_baseline.py`: Evolution Guided GFlowNet
  - `sac_baseline.py`: Soft Actor-Critic
  - `mars_baseline.py`: MCMC baseline

- **`environments/`**: Test environments

  - `hypergrid_environment.py`: Sparse reward navigation
  - `base_environment.py`: Environment interface

- **`metrics/`**: Performance evaluation

  - `performance_metrics.py`: Basic performance measures
  - `diversity_metrics.py`: Diversity calculations
  - `efficiency_metrics.py`: Sample efficiency

- **`utils/`**: Supporting utilities
  - `wandb_utils.py`: Experiment tracking
  - `utils.py`: General utilities
  - `experiment_tracker.py`: Experiment management
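
To make the `environments/` split concrete, here is a minimal sketch of how an environment interface and a sparse-reward hypergrid could fit together. The class names, method names, and constructor parameters are assumptions for illustration, not the contents of `base_environment.py` or `hypergrid_environment.py`. The reward follows the form commonly used for hypergrid tasks in the GFlowNet literature; whether this repository uses the same form or constants is also an assumption.

```python
from abc import ABC, abstractmethod
from typing import Tuple

State = Tuple[int, ...]


class BaseEnvironment(ABC):
    """Hypothetical interface; the real one lives in base_environment.py."""

    @abstractmethod
    def reset(self) -> State:
        """Return the initial state."""

    @abstractmethod
    def step(self, state: State, action: int) -> Tuple[State, bool]:
        """Apply an action and return (next_state, done)."""

    @abstractmethod
    def reward(self, state: State) -> float:
        """Return the reward of a terminal state."""


class HyperGrid(BaseEnvironment):
    """A grid of side `size` in `ndim` dimensions with reward concentrated near the corners."""

    def __init__(self, ndim: int = 2, size: int = 8,
                 r0: float = 1e-3, r1: float = 0.5, r2: float = 2.0) -> None:
        self.ndim, self.size = ndim, size
        self.r0, self.r1, self.r2 = r0, r1, r2  # assumed reward constants

    def reset(self) -> State:
        return (0,) * self.ndim

    def step(self, state: State, action: int) -> Tuple[State, bool]:
        # Actions 0..ndim-1 increment one coordinate; action `ndim` terminates.
        if action == self.ndim:
            return state, True
        if not 0 <= action < self.ndim:
            raise ValueError(f"unknown action {action}")
        if state[action] + 1 >= self.size:
            raise ValueError("action would leave the grid")
        nxt = list(state)
        nxt[action] += 1
        return tuple(nxt), False

    def reward(self, state: State) -> float:
        # Distance of each normalised coordinate from the grid centre, in [0, 0.5].
        dist = [abs(x / (self.size - 1) - 0.5) for x in state]
        outer = all(d > 0.25 for d in dist)      # broad region around each corner
        ring = all(0.3 < d < 0.4 for d in dist)  # narrow high-reward mode
        return self.r0 + self.r1 * outer + self.r2 * ring
```

Keeping `reward` separate from `step` means samplers only pay for a reward evaluation on terminal states, which is where sparse-reward environments are usually scored.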

## Experiments Directory

### Research Questions

**RQ1**: Comparative Analysis

- Compares DATE-GFN against all baselines
- Environment: Hypergrid
- Metrics: Performance, diversity, efficiency

**RQ2**: Robustness & Stability

- Lambda (λ) ablation study
- Training stability analysis
- Variance reduction validation

**RQ3**: Sample Efficiency

- Performance as a function of the number of critic evaluations
- Learning curve analysis
- Efficiency metrics

**RQ4**: Exploration-Exploitation

- Lambda as exploration controller
- Mode discovery timing
- Convergence analysis

**RQ5**: Population Diversity

- Critic population analysis
- Mode collapse prevention
- Diversity maintenance

**RQ6**: Scalability

- Problem difficulty scaling
- Time-to-solution analysis
- Computational complexity
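
RQ4 and RQ5 both depend on knowing when modes are found. One way to measure that is sketched below, assuming a mode is a known high-reward terminal state and that training yields one batch of sampled states per step. The function name and data layout are hypothetical and are not taken from `metrics/`.

```python
from typing import Dict, Hashable, Iterable, Sequence, Set


def first_discovery_steps(
    batches: Iterable[Sequence[Hashable]],
    modes: Set[Hashable],
) -> Dict[Hashable, int]:
    """Map each discovered mode to the index of the first step whose batch contains it."""
    discovered: Dict[Hashable, int] = {}
    for step, batch in enumerate(batches):
        for sample in batch:
            if sample in modes and sample not in discovered:
                discovered[sample] = step
    return discovered


# Toy data: four assumed modes on a 2-D grid and three training steps.
modes = {(1, 1), (1, 6), (6, 1), (6, 6)}
batches = [
    [(0, 0), (3, 3)],
    [(1, 1), (2, 5)],
    [(1, 1), (6, 6)],
]

found = first_discovery_steps(batches, modes)
print(found)                    # {(1, 1): 1, (6, 6): 2}
print(len(found) / len(modes))  # 0.5, the fraction of modes discovered
```

Plotting the number of discovered modes against the step index gives a simple discovery curve. A curve that flattens well below the total mode count is one possible symptom of mode collapse.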

### Robustness Analysis

Additional experiments that address weaknesses in the methodology. The RQ labels below are numbered separately from the six research questions above:

- **RQ1**: Computational efficiency (amortized updates)
- **RQ2**: Adaptive lambda scheduling
- **RQ3**: Molecular generation scalability

## Examples Directory

**Purpose**: Demonstrate usage and provide starting points

- `train_date_gfn.py`: Basic training

- **`hypergrid/`**: Hypergrid-specific examples

  - `simple_demo.py`: Method comparison
  - `lambda_ablation.py`: Parameter tuning

## Configuration System

**Purpose**: Centralized parameter management

- **`base_config.yaml`**: Default parameters
- **`hypergrid_config.yaml`**: Hypergrid environment settings

Example configuration:

```yaml
training:
  num_steps: 2000
  batch_size: 32
  learning_rate: 1.0e-3  # decimal point so YAML 1.1 loaders such as PyYAML parse a float, not a string

date_gfn:
  population_size: 16
  teachability_weight: 0.1
  elite_ratio: 0.25
```
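
A common way to combine a default file with an environment-specific file is a recursive merge, where the environment file only lists the values it changes. The sketch below assumes that convention. Whether the repository merges `base_config.yaml` and `hypergrid_config.yaml` this way is an assumption, and `merge_configs` and the override values are hypothetical. Plain dictionaries stand in for the parsed YAML so the example needs only the standard library.

```python
from copy import deepcopy
from typing import Any, Dict, Mapping


def merge_configs(base: Mapping[str, Any], override: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which `override` replaces `base`, recursing into nested sections."""
    merged = deepcopy(dict(base))
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Values copied from the example configuration above.
base_config = {
    "training": {"num_steps": 2000, "batch_size": 32, "learning_rate": 1e-3},
    "date_gfn": {"population_size": 16, "teachability_weight": 0.1, "elite_ratio": 0.25},
}

# Hypothetical environment-specific override, invented for illustration.
hypergrid_overrides = {"training": {"num_steps": 5000}}

config = merge_configs(base_config, hypergrid_overrides)
print(config["training"])
```

Because the merge returns a copy, `base_config` is left unchanged and can be reused for other environments.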

## Scripts Directory

**Purpose**: Automation and utility scripts

- **`setup_environment.sh`**: Complete environment setup
- **`run_comparative_analysis.py`**: Method comparison
- **`run_all_experiments.py`**: Full experimental suite
- **`generate_plots.py`**: Result visualization
- **`analyze_results.py`**: Post-experiment analysis

## Usage Patterns

### 1. Method Comparison

```bash
python scripts/run_comparative_analysis.py \
    --environment hypergrid \
    --methods date-gfn,gfn-tb,egfn
```
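
The `--methods` flag takes a comma-separated list. A script could parse such a list with `argparse` roughly as follows. This is an illustrative sketch, not the code in `run_comparative_analysis.py`: the defaults, the validation, and the helper names are assumptions.

```python
import argparse
from typing import List


def parse_methods(value: str) -> List[str]:
    """Split a comma-separated method list, dropping empty entries."""
    methods = [name.strip() for name in value.split(",") if name.strip()]
    if not methods:
        raise argparse.ArgumentTypeError("expected at least one method name")
    return methods


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the two flags shown in the command above.
    parser = argparse.ArgumentParser(description="Hypothetical comparison CLI")
    parser.add_argument("--environment", default="hypergrid")
    parser.add_argument("--methods", type=parse_methods, required=True)
    return parser


args = build_parser().parse_args(
    ["--environment", "hypergrid", "--methods", "date-gfn,gfn-tb,egfn"]
)
print(args.environment, args.methods)
```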

### 2. Research Question

```bash
python experiments/rq1_comparative/run_rq1.py
```
