# IMLP: Context-Aware Incremental Learning for Tabular Data

## Overview

IMLP is a neural architecture for domain-incremental learning on tabular data, designed with energy efficiency in mind. Instead of a replay buffer, it uses an attentional "look-back" module that reuses hidden representations from earlier data segments, keeping accuracy competitive while lowering energy use.

Key features:
- Attention-based rehearsal mechanism for effective feature reuse
- Constant-time updates with a single shared backbone
- Energy-aware evaluation framework with real-time power measurement
- Accuracy competitive with gradient-boosted decision trees (GBDTs) at significantly lower energy consumption
- NetScore-T: A joint metric that evaluates both accuracy and energy efficiency
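
To make the look-back idea concrete, here is a minimal sketch of scaled dot-product attention from the current segment's hidden vector over hidden vectors kept from earlier segments. It is an illustration only, not the code in `src/models/`: the function name `look_back` is hypothetical, plain Python lists stand in for PyTorch tensors, and the learned query/key/value projections that a real attention layer would have are omitted.

```python
import math


def look_back(query, memory):
    """Attend from the current hidden vector over earlier hidden vectors.

    query:  list[float], hidden representation of the current segment
    memory: list[list[float]], hidden representations from earlier segments
    Returns (context, weights); context is the attention-weighted average
    of the memory vectors.
    """
    if not memory:
        # First segment: nothing to look back at, so pass the query through.
        return list(query), []
    scale = math.sqrt(len(query))
    scores = [sum(q * m for q, m in zip(query, vec)) / scale for vec in memory]
    # Numerically stable softmax over the similarity scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [
        sum(w * vec[i] for w, vec in zip(weights, memory))
        for i in range(len(query))
    ]
    return context, weights


# Hidden states from two earlier segments (illustrative values).
memory = [[1.0, 0.0], [0.0, 1.0]]
context, weights = look_back([2.0, 0.0], memory)
```

The earlier representation most similar to the current one receives the largest weight, so the model draws on past features without storing or replaying raw training rows.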

## Installation

### Requirements

This project requires Python 3.10+ and PyTorch. To clone the repository and install it together with its dependencies:

```bash
git clone https://github.com/yourusername/imlp
cd imlp
pip install -e .
```

For hardware power measurements, we use an ElmorLabs PMD-USB power meter with a PCIe slot adapter. This hardware is optional for running the models but necessary to reproduce our energy measurements.

### Directory Structure

- `configs/`: Configuration files for experiments (hyperparameters for each model)
- `data/`: Scripts for downloading and processing TabZilla/OpenML datasets
- `plotting/`: Visualization tools and critical difference diagrams
- `scripts/`: Utility scripts for running experiments and power measurements
- `src/`: Core implementation
  - `models/`: IMLP and MLP implementations
  - `experiment.py`: Unified training and evaluation framework
  - `energy_monitor.py`: Hardware power monitoring interface
  - `cli.py`: Command-line interface for experiments
- `TabZilla/`: Interface to baseline models from the TabZilla benchmark

## Usage

### Running Experiments

To train and evaluate IMLP on a specific OpenML task:

```bash
python src/cli.py --experiment configs/experiments/imlp.yaml --task <OPENML_TASK_ID> --device cuda
```
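
If you want to run several tasks in sequence without the full benchmark script, a small wrapper like the following may help. It is a sketch that only assumes the three flags shown above; `build_command` and `run_tasks` are hypothetical helpers, and the task IDs are placeholders rather than IDs from the paper.

```python
import subprocess
import sys


def build_command(task_id, config="configs/experiments/imlp.yaml", device="cuda"):
    """Build the src/cli.py invocation for one OpenML task."""
    return [
        sys.executable, "src/cli.py",
        "--experiment", config,
        "--task", str(task_id),
        "--device", device,
    ]


def run_tasks(task_ids, dry_run=True, **kwargs):
    """Run (or, with dry_run=True, only print) one command per task."""
    commands = [build_command(task_id, **kwargs) for task_id in task_ids]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop at the first failing task.
            subprocess.run(cmd, check=True)
    return commands


# Placeholder task IDs; replace them with the OpenML tasks you need.
commands = run_tasks([1001, 1002])
```

Run it from the repository root so that the relative paths to `src/cli.py` and the config file resolve, and pass `dry_run=False` to actually launch the experiments.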

To run the complete benchmark across all TabZilla datasets (as in the paper):

```bash
bash scripts/run_experiment.sh
```

### Power Measurement

To enable power measurement during experiments, connect the ElmorLabs PMD-USB meter to your system. The monitor in `src/energy_monitor.py` will automatically detect and use the device.
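
A power meter reports instantaneous power, so energy for a run is obtained by integrating those readings over time. The sketch below shows that step with the trapezoidal rule. It assumes samples arrive as `(timestamp_seconds, power_watts)` pairs; that format and the function name `energy_joules` are illustrative assumptions, not the interface of `energy_monitor.py`.

```python
def energy_joules(samples):
    """Integrate (timestamp_s, power_w) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 < t0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive readings.
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


# Three illustrative readings taken one second apart.
samples = [(0.0, 100.0), (1.0, 120.0), (2.0, 110.0)]
energy = energy_joules(samples)   # joules
energy_wh = energy / 3600.0       # watt-hours
```

Using timestamps rather than a fixed sampling interval keeps the estimate correct when readings arrive at irregular intervals.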

### Results Analysis

To generate performance comparison figures and tables after running experiments:

```bash
cd plotting
# Run the Jupyter notebook
jupyter notebook aggregate_results.ipynb

# Generate critical difference diagrams
cd cd-diagram
python main.py
```

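## Reproducing Paper Results

Run each step below from the repository root; the `cd` commands and script paths are relative to it.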

1. **Data Preparation**:
   ```bash
   cd data
   python openml_data_processor.py --task_list openml_import.txt
   ```

2. **Run Experiments**:
   ```bash
   bash scripts/run_experiment.sh
   ```

3. **Generate Figures**:
   ```bash
   cd plotting
   jupyter notebook aggregate_results.ipynb
   cd cd-diagram
   python main.py
   ```

## License

This project is licensed under the MIT License; see the `LICENSE` file for details.

## Acknowledgments

We thank the creators of the TabZilla benchmark and the OpenML community for providing the datasets used in this work.