# Pump Scheduling

This repository serves as a testbed for Deep Reinforcement Learning (DRL) applied to pump scheduling in a real-world water distribution system. It includes a simulator, datasets, and implementations of offline RL algorithms to facilitate research and experimentation.

## Table of Contents
- [Abstract](#abstract)
- [Getting Started](#getting-started)
  - [Recommended Environment](#recommended-environment)
  - [Generating Transitions and Training a Policy](#generating-transitions-and-training-a-policy)
  - [Evaluating the Trained Policy](#evaluating-the-trained-policy)
- [Data Organization](#data-organization)
  - [Data Collection](#data-collection)
  - [Folder Hierarchy and Data Variables](#folder-hierarchy-and-data-variables)
- [Code Organization](#code-organization)
- [Licensing](#licensing)
- [Contact](#contact)
- [Project Status](#project-status)

## Abstract

Deep Reinforcement Learning (DRL) has achieved strong results in complex environments such as games, where it scales to high-dimensional state spaces. However, its application to real-world control tasks remains limited by challenges such as unknown reward functions and complex dynamics. This repository helps bridge that gap by providing a testbed for pump scheduling in a real-world water distribution facility. The pump scheduling problem involves deciding when to operate pumps to supply water while minimizing electricity consumption and ensuring system safety. The repository includes:

- A simulator for the water distribution system.
- Real-world operational data (demonstrations).
- A well-documented codebase with baseline DRL implementations.

The code and datasets are available in this repository to support researchers and practitioners in applying DRL to real-world control tasks.

## Getting Started

Follow these steps to set up the environment, generate transitions, train a policy, and evaluate performance.

### Recommended Environment

To reproduce the experiments, we recommend the following:

- **Python Version**: 3.9.19
- **Core Libraries**:
  - TensorFlow==2.11.0
  - Keras==2.11.0
- **Additional Packages**:
  - [pandas](https://pandas.pydata.org/)
  - [scikit-learn](https://scikit-learn.org/stable/)
  - [numpy](https://numpy.org/)

Install the dependencies using pip:

```bash
pip install tensorflow==2.11.0 keras==2.11.0 pandas scikit-learn numpy
```
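
To confirm that the installed versions match the pins above, a quick check along these lines may help. This is a minimal sketch and not part of the repository: `EXPECTED` and `check_environment` are hypothetical names, and the check relies only on the standard-library `importlib.metadata`.

```python
import sys
from importlib import metadata

# Versions recommended in this README (hypothetical helper, not repository code).
EXPECTED = {"tensorflow": "2.11.0", "keras": "2.11.0"}


def check_environment(expected=EXPECTED):
    """Map each package name to (installed_version_or_None, matches_expected)."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (found, found == wanted)
    return report


print("Python", sys.version.split()[0])
for name, (found, ok) in check_environment().items():
    status = "OK" if ok else f"expected {EXPECTED[name]}"
    print(f"{name}: {found} ({status})")
```

A mismatch does not necessarily break the code; it only indicates that the environment differs from the one used for the experiments.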

### Generating Transitions and Training a Policy

The simulator generates transitions (`<observation, action, reward, next_observation>`) and trains a policy using offline RL algorithms. Supported algorithms include DDQN, BCQ, REM, and Maxmin Q-learning.
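
As an illustration of the transition tuple, the sketch below pairs consecutive observations of one logged trajectory into `<observation, action, reward, next_observation>` records. It is an assumption-laden example, not the repository's data structure: the `Transition` class, the `to_transitions` helper, and the field types are hypothetical, and the on-disk format of `replay_memory.txt` may differ.

```python
from typing import List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One step of experience (hypothetical layout; field types are placeholders)."""
    observation: Sequence[float]
    action: int
    reward: float
    next_observation: Sequence[float]


def to_transitions(observations, actions, rewards) -> List[Transition]:
    """Pair consecutive observations of a single trajectory into transitions.

    Expects one more observation than actions, and one reward per action.
    """
    if len(observations) != len(actions) + 1 or len(rewards) != len(actions):
        raise ValueError("need len(observations) == len(actions) + 1 == len(rewards) + 1")
    return [
        Transition(observations[t], actions[t], rewards[t], observations[t + 1])
        for t in range(len(actions))
    ]
```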

1. **Select an Algorithm**:
   Run the simulator with the desired algorithm:

   ```bash
   python3 simulator_real_data.py --algorithm <ddqn|rem|maxmin>
   ```

   **For BCQ**:
   - Open `ddrqn.py` and set `BCQ = True`.
   - Run:

     ```bash
     python3 simulator_real_data.py --algorithm ddqn
     ```

2. **Outputs**:
   - **Trained Policy**: Saved in the `MODEL` folder.
   - **Transitions**: Stored in `replay_memory.txt`.

3. **Customize Data Generation**:
   To adjust the years used for generating transitions, modify the `YEARS` list in `simulator_real_data.py`. For example:

   ```python
   YEARS = ['2013', '2014']  # Generates data for 2013 and 2014
   ```

4. **Skip Transition Generation**:
   To reuse existing transitions and only train the model, comment out the `self.run()` line in `simulator_real_data.py`:

   ```python
   self.qlearning = algorithm_map[algorithm]()
   # self.run()  # Commented out to skip transition generation
   self.qlearning.feed_memory()
   self.qlearning.train_model()
   ```

   **Note**: Ensure `replay_memory.txt` exists before skipping generation.
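
A small guard can catch a missing transitions file before training starts. The sketch below is hypothetical (`replay_memory_ready` is not a function in this repository); since the file format is not described here, it only checks that the file exists and is non-empty.

```python
from pathlib import Path


def replay_memory_ready(path="replay_memory.txt"):
    """Return True if the transitions file exists and is non-empty."""
    file = Path(path)
    return file.is_file() and file.stat().st_size > 0
```

One option would be to call `replay_memory_ready()` before `feed_memory()` and stop with a clear message when it returns `False`.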

### Evaluating the Trained Policy

Evaluate the trained policy using water consumption data from 2012 (default):

```bash
python3 evaluation.py
```

- **Input**: The simulator uses 2012 water consumption data and the initial tank level from the logged data.
- **Output**: Results are saved in `results_evaluation.txt` and visualized in plots (e.g., `TankLevel_evaluation2012.png`).

To evaluate with a different year, modify the `YEAR` variable in `evaluation.py`:

```python
YEAR = '2013'  # Change to desired year
```

## Data Organization

### Data Collection

- **Years**: 2012, 2013, 2014
- **Location**: Germany
- **Method**: Sensors
- **Format**: CSV files

### Folder Hierarchy and Data Variables

The data is organized as follows:

- **`Data/{2012|2013|2014}`**: Data for each year.
  - **`Tank1/{Month}`**: Water tank level (`TK1_Hoehe_pval`).
  - **`WaterConsumption/{Month}`**: Water consumption (`Netzverbrauch_pval`).
  - **`NP/NP{#}`**: Pump operating data for pump # (1–4), split into one subfolder per quantity.
    - **`KW/{Month}`**: Electricity consumption (`NP_{#}_SIMEAS_P_Leistung_pval`).
    - **`Q/{Month}`**: Water flow (`NP_{#}_Volumenfluss_pval`).
    - **`H/{Month}`**: Hydraulic head (`NP_{#}_Druck_Druckseite_pval`).

**Example**:
- `Data/2012/WaterConsumption/März.csv`: Water consumption for March 2012.
- `Data/2013/NP/NP1/Q/Januar.csv`: Water flow for pump 1 in January 2013.

**Note**: Month names use German spellings (e.g., "März" for March, "Januar" for January).
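
The helpers below sketch how file paths and column names could be built from this layout, following the two example paths above. They are hypothetical and not part of the repository. Only "Januar" and "März" are confirmed by this README; the remaining entries in `MONTHS_DE` are the standard German month names and are assumed to match the file names.

```python
from pathlib import Path

# Standard German month names; only "Januar" and "März" are confirmed above.
MONTHS_DE = [
    "Januar", "Februar", "März", "April", "Mai", "Juni",
    "Juli", "August", "September", "Oktober", "November", "Dezember",
]

# Column names as listed above; "{n}" stands for the pump number (1-4).
PUMP_COLUMNS = {
    "KW": "NP_{n}_SIMEAS_P_Leistung_pval",  # electricity consumption
    "Q": "NP_{n}_Volumenfluss_pval",        # water flow
    "H": "NP_{n}_Druck_Druckseite_pval",    # hydraulic head
}


def _month_file(month):
    """File name for a month given as an integer from 1 to 12."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return f"{MONTHS_DE[month - 1]}.csv"


def series_path(year, month, kind, root="Data"):
    """Path to a tank-level ("Tank1") or "WaterConsumption" file."""
    if kind not in ("Tank1", "WaterConsumption"):
        raise ValueError(f"unknown kind: {kind}")
    return Path(root) / str(year) / kind / _month_file(month)


def pump_path(year, month, pump, quantity, root="Data"):
    """Path to a pump file for quantity "KW", "Q", or "H"."""
    if pump not in range(1, 5) or quantity not in PUMP_COLUMNS:
        raise ValueError("pump must be 1-4 and quantity one of KW, Q, H")
    return Path(root) / str(year) / "NP" / f"NP{pump}" / quantity / _month_file(month)


def pump_column(pump, quantity):
    """Column name holding the given quantity for the given pump."""
    return PUMP_COLUMNS[quantity].format(n=pump)
```

For example, `pump_path(2013, 1, 1, "Q")` yields `Data/2013/NP/NP1/Q/Januar.csv`. The resulting path can be passed to `pandas.read_csv`, although the delimiter and encoding of the files are not documented here and may need to be set explicitly.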

## Code Organization

All code is written in **Python 3**. Below is a description of each file:

- **`simulator_real_data.py`**:
  - Generates transitions (`<observation, action, reward, next_observation>`) based on logged data actions.
  - Stores transitions in `replay_memory.txt`.
  - Supports multi-year data; because the data uses 1-minute timesteps, processing several years can take a while.
  - Trains the selected RL algorithm.

- **`prioritized_replay.py`**:
  - Implements [Prioritized Experience Replay](https://arxiv.org/abs/1511.05952).

- **`ddrqn.py`**:
  - Implements [Double Deep Q-Networks (DDQN)](https://arxiv.org/abs/1509.06461) with [Deep Recurrent Q-Learning](https://arxiv.org/abs/1507.06527).
  - Supports [Batch-Constrained Q-learning (BCQ)](https://arxiv.org/abs/1910.01708) when `BCQ = True`.

- **`bcq.py`**:
  - Provides auxiliary methods for [Batch-Constrained Q-learning (BCQ)](https://arxiv.org/abs/1910.01708), including action classification based on logged data.

- **`rem.py`**:
  - Implements [Random Ensemble Mixture (REM)](https://arxiv.org/abs/1907.04543).

- **`maxmin.py`**:
  - Implements [Maxmin Q-learning](https://arxiv.org/abs/2002.06487).

- **`evaluation.py`**:
  - Evaluates the trained policy using the simulator and one year of unseen water consumption data (default: 2012).
  - Outputs results in text files and plots.

- **`real_data_statistics.py`**:
  - Analyzes real-world operational data through the simulator.
  - Generates text files and plots for comparison with trained policies.

- **`vsp_sketch.py`**:
  - A preliminary implementation of variable speed pumps (see the associated paper for details).
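
For orientation, the bootstrap targets used by DDQN, REM, and Maxmin Q-learning differ mainly in how the next-state value is estimated. The sketch below shows the textbook form of each target for a single transition in plain Python. It is not the code in `ddrqn.py`, `rem.py`, or `maxmin.py`: all function names are hypothetical, and the recurrent networks, BCQ action filtering, and replay logic are omitted.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


def ddqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    return reward + gamma * q_target_next[argmax(q_online_next)]


def rem_target(reward, done, q_heads_next, alphas, gamma=0.99):
    """REM: mix K heads with convex weights `alphas`, then take the max over actions.

    In REM the weights are resampled at random during training; here they are
    passed in explicitly so the example stays deterministic.
    """
    if done:
        return reward
    n_actions = len(q_heads_next[0])
    mixed = [
        sum(alpha * head[a] for alpha, head in zip(alphas, q_heads_next))
        for a in range(n_actions)
    ]
    return reward + gamma * max(mixed)


def maxmin_target(reward, done, q_estimates_next, gamma=0.99):
    """Maxmin: take the minimum over N estimators per action, then the max over actions."""
    if done:
        return reward
    q_min = [min(per_action) for per_action in zip(*q_estimates_next)]
    return reward + gamma * max(q_min)
```

In each function the Q-value arguments are lists indexed by action for the next observation; `q_heads_next` and `q_estimates_next` are lists of such lists, one per head or estimator.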

## Licensing

<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.

## Contact

For support or to contribute to this project, contact henrique dot donancio at inria dot fr.

## Project Status

Submitted to the 39th Conference on Neural Information Processing Systems (NeurIPS 2025) Track on Datasets and Benchmarks.
