# DATR: DDI-Aware Therapeutic Structure Reconstruction for Safer Medication Recommendation

## Data Preparation

### Required Raw Data
Place the following files in `data/input/`:
- `PRESCRIPTIONS.csv` (from MIMIC-III and MIMIC-IV)
- `PROCEDURES_ICD.csv` (from MIMIC-III and MIMIC-IV)
- `DIAGNOSES_ICD.csv` (from MIMIC-III and MIMIC-IV)
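
Before starting the pipeline, it can help to confirm that the raw files are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_raw_files` is hypothetical, and it assumes the check is run from the repository root so that `data/input` resolves correctly.

```python
from pathlib import Path

# File names are taken from the list above.
REQUIRED_FILES = ("PRESCRIPTIONS.csv", "PROCEDURES_ICD.csv", "DIAGNOSES_ICD.csv")


def missing_raw_files(input_dir="data/input"):
    """Return the required raw files that are not present in input_dir.

    Hypothetical helper: the default path assumes the current working
    directory is the repository root.
    """
    root = Path(input_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files()
    if missing:
        print("Missing from data/input/:", ", ".join(missing))
    else:
        print("All required raw files are in place.")
```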

### Provided Data Files
The following files ship with the repository in `data/`:
- `ATC_CID.xlsx`: ATC hierarchy with the corresponding compound CIDs
- `drug-DDI.csv`: Drug-drug interaction database
- `Druginfo.csv`: CID to SMILES mapping
- `ndc2RXCUI`: NDC to RXCUI mapping
- `RXCUI2atc4`: RXCUI to ATC4 mapping

### Data Processing Pipeline
1. Run the data processing script from the repository root:
   ```bash
   python data/process.py
   ```
2. Generate molecular graph data:
   ```bash
   python data/get_mol_graph_data.py
   ```
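
The two steps above can also be chained so that the second script only runs if the first one succeeds. This is a sketch under assumptions rather than something the repository provides: `pipeline_commands` and `run_pipeline` are hypothetical names, and the script paths assume the current working directory is the repository root.

```python
import subprocess
import sys

# Script paths come from the two steps above, in the documented order.
PIPELINE = ("data/process.py", "data/get_mol_graph_data.py")


def pipeline_commands(python=sys.executable):
    """Build one command per pipeline step (hypothetical helper)."""
    return [[python, script] for script in PIPELINE]


def run_pipeline():
    """Run each step in order; check=True raises on the first failing script."""
    for cmd in pipeline_commands():
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` from the repository root executes both scripts with the same interpreter that runs the sketch, which keeps the steps inside the active virtual environment.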

## Model Training
To train the model:
```bash
python main.py
```

## Inference
To run predictions:
```bash
python infer.py
```

## File Structure
```
DATR/
├── data/
│   ├── input/                 # Raw input data (user-provided)
│   │   ├── PRESCRIPTIONS.csv
│   │   ├── PROCEDURES_ICD.csv
│   │   └── DIAGNOSES_ICD.csv
│   ├── ATC_CID.xlsx           # ATC hierarchy and CIDs
│   ├── drug-DDI.csv           # Drug interaction database
│   ├── Druginfo.csv           # CID-SMILES mapping
│   ├── ndc2RXCUI              # NDC-RXCUI mapping
│   ├── RXCUI2atc4             # RXCUI-ATC4 mapping
│   ├── process.py             # Main data processing script
│   └── get_mol_graph_data.py  # Molecular graph generation
├── model.py                   # Model architecture
├── utils.py                   # Utility functions
├── main.py                    # Training script
└── infer.py                   # Inference script
```

## Dependencies
See `requirements.txt` for all Python dependencies. Install with:
```bash
pip install -r requirements.txt
```

Note: This project requires a PyTorch build with CUDA 11.8 support.
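
To see which PyTorch build is installed and whether it can reach a GPU, a quick check along these lines may be useful. It is a minimal sketch: `describe_torch_cuda` is a hypothetical helper, and it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_cuda():
    """Report the installed PyTorch build and whether a CUDA device is usable."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    return {
        "installed": True,
        "torch_version": torch.__version__,
        # CUDA version the build was compiled against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # True only if a CUDA device and a working driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    info = describe_torch_cuda()
    print(info)
    if info.get("built_with_cuda") != "11.8":
        print("Warning: this README asks for a PyTorch build with CUDA 11.8.")
```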

