# Missing-value handling

This project is a Python application for imputing missing values and for reproducing the experiments from the [publication](https://).

## knnSampler imputation algorithm

`knnSampler` is a kNN-based method for missing-value imputation with support for multiple imputation and uncertainty quantification (see the [publication](https://)).
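To give a feel for kNN-based imputation with multiple draws, here is a minimal, illustrative sketch. It is not the project's implementation. The function name `knn_sample_impute`, its parameters, the Euclidean distance, and the uniform sampling from neighbour values are all assumptions made for this example; refer to the publication for the actual method.

```python
import math
import random
import statistics
from typing import List, Optional, Sequence


def knn_sample_impute(
    rows: Sequence[Sequence[Optional[float]]],
    target_col: int,
    k: int = 3,
    n_draws: int = 5,
    seed: int = 0,
) -> List[List[float]]:
    """Hypothetical helper: for each row whose `target_col` is None, draw
    `n_draws` values from the observed targets of its k nearest donor rows."""
    rng = random.Random(seed)
    feature_cols = [c for c in range(len(rows[0])) if c != target_col]
    # Donors are the rows where the target value is observed.
    donors = [r for r in rows if r[target_col] is not None]

    def distance(a: Sequence[Optional[float]], b: Sequence[Optional[float]]) -> float:
        # Euclidean distance over feature columns (assumed fully observed here).
        return math.sqrt(sum((a[c] - b[c]) ** 2 for c in feature_cols))

    draws_per_row: List[List[float]] = []
    for row in rows:
        if row[target_col] is not None:
            continue  # nothing to impute for this row
        nearest = sorted(donors, key=lambda d: distance(row, d))[:k]
        neighbour_values = [d[target_col] for d in nearest]
        # Multiple imputation: several independent draws per missing cell.
        draws_per_row.append([rng.choice(neighbour_values) for _ in range(n_draws)])
    return draws_per_row


# Toy data: column 0 is a feature, column 1 is the target with one gap.
data = [
    [1.0, 10.0],
    [1.1, 11.0],
    [0.9, 9.0],
    [5.0, 50.0],
    [1.05, None],
]
draws = knn_sample_impute(data, target_col=1, k=3, n_draws=5)
point_estimate = statistics.mean(draws[0])  # summary of the draws
spread = statistics.pstdev(draws[0])        # rough uncertainty indicator
```

In this sketch, each missing cell receives several plausible values instead of a single one, and the spread of those draws serves as a rough indicator of imputation uncertainty.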

## How to cite

If you use knnSampler, please cite the original [publication](https://).

## Running the project

### Prerequisites

This project requires Python 3.8+ and [poetry](https://python-poetry.org/).

1. Clone the repository

```shell
git clone <repository_url>
```

2. Navigate to the project directory

```shell
cd <repository_directory>
```

3. Install dependencies

```shell
poetry install --no-root
```

### Run algorithms

The project is configured through a self-documenting configuration file, `assets/config.conf`.  
The default configuration reproduces the results from the [publication](https://) section(s) ... .

```shell
poetry run task main
```

This command runs the main imputation pipeline with the settings in `assets/config.conf`.

### Benchmark algorithms

The benchmark reproduces results from the [publication](https://) section(s) ... .  
Note: the benchmarking scripts have no dedicated configuration files. To change benchmark settings, edit the top of [benchmark_all.py](./benchmark_all.py) and [benchmark_knnsampler.py](./benchmark_knnsampler.py).

#### Benchmark all algorithms

Compares the imputation algorithms against each other.

```shell
poetry run task benchmark_all
```

#### Benchmark knnSampler

Produces detailed knnSampler results across different parameter intervals.

```shell
poetry run task benchmark_knnsampler
```

### Code formatting

```shell
poetry run task format
```

### Code linting

```shell
poetry run task lint
```

### Code testing

```shell
poetry run task test
```

### Install code quality git hooks

```shell
poetry run pre-commit install
```