# Self-Calibrating BCIs: Ranking and Recovery of Mental Targets Without Labels

Warning: Due to file size limitations, we provide a subset of the data through OpenReview.
While this is on a best-effort basis and should serve as a starting point, we cannot guarantee that all paper results can be exactly reproduced with this subset.
However, the full data (EEGNet embeddings) is available at the ANONYMOUS LINK https://mega.nz/file/DZ0yxZKK#Yud5zIZCyGdzPjqx0Y3jX_RlRQOLS_LzXJPVjiiNVEQ during the review process.
See the section "File size considerations" for more information.

## Usage

Run `setup_experiments.py` to generate the 170 datasets used for statistically sound results.
This step takes some time to complete and creates a new directory `./experiments/datasets`.

Then run `run_experiments.py`, which runs experiments for various user-defined combinations of conditions and saves the results in `_RESULTS_DIR`.
This step creates a new directory with many files under `./experiments/results`.
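
The two steps above can be chained from Python. This is a minimal sketch, assuming both scripts live in the repository root and can be run without command-line arguments (neither is confirmed here); `build_commands` and `run_pipeline` are hypothetical helper names, not part of the repository.

```python
import subprocess
import sys

# Order matters: the datasets must exist before the experiments run.
PIPELINE = ["setup_experiments.py", "run_experiments.py"]


def build_commands():
    """Return one interpreter command per pipeline script, in order."""
    return [[sys.executable, script] for script in PIPELINE]


def run_pipeline(repo_root="."):
    """Run each script from repo_root, stopping at the first failure."""
    for cmd in build_commands():
        # check=True raises CalledProcessError if a script exits non-zero.
        subprocess.run(cmd, check=True, cwd=repo_root)
```

Calling `run_pipeline()` from the repository root should then leave the datasets under `./experiments/datasets` and the results under `./experiments/results`.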

The other `run_*` files run the remaining experiments reported in the paper.

Useful functions can be found under `tools/`; the file and function names are self-explanatory.

### Tools

Key tools include:

- `scoring_tools.py` contains the CURSOR scoring function, with the regressors in `scorers.py`.
- `dataset_tools.py` helps load data.
- `face_tools.py` helps generate images from $z$ vectors.

### H-ID user experiment

`/user_experiments` has all the code to run H-ID experiments as a web app, with results saved as CSV.

### Plots

All code to plot results is available in `analysis/plots`.

We have saved the pandas dataframe with all results from our experiments in `analysis/final_dfs` and all plots load only those files.

Those files can be regenerated using `analysis/gather_results.py` and `analysis/compile_final_df.ipynb`.
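
To inspect the saved results outside the plotting code, something like the following may help. It is a sketch only: it assumes the files in `analysis/final_dfs` are stored as Parquet (as `optim.parquet` suggests) and that a Parquet engine such as `pyarrow` is installed. `list_result_files` and `load_results` are hypothetical helpers, not functions from this repository.

```python
from pathlib import Path

# Directory named in this README; adjust if the layout differs.
FINAL_DFS_DIR = Path("analysis/final_dfs")


def list_result_files(directory=FINAL_DFS_DIR):
    """Return the Parquet files found in the results directory, sorted by name."""
    return sorted(Path(directory).glob("*.parquet"))


def load_results(path):
    """Load one compiled results file into a pandas DataFrame."""
    # Imported lazily so listing files works even without pandas installed.
    import pandas as pd

    return pd.read_parquet(path)
```

For example, `load_results(FINAL_DFS_DIR / "optim.parquet")` would return the dataframe behind the plots that rely on that file.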

## File size considerations

Some of the files are too large to be distributed at submission time:

- `data/all_data_sorted.npz`: the script in `data/trim.ipynb` trims all data to the desired size (N). The resulting file must then be renamed to `all_data_sorted.npz`, replacing the existing one.
  We did not test the full pipeline with all possible data sizes, but it should work as intended.
- `analysis/final_dfs`: this holds the compiled results from all the runs we performed.
  `optim.parquet` is small and useful enough that we left it in, so some results can still be plotted.
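
The trimming step can be approximated as follows. This is a sketch of the idea, not the contents of `data/trim.ipynb`: it assumes every array in the archive is indexed by sample along its first axis, and `trim_npz` is a hypothetical helper name.

```python
import numpy as np


def trim_npz(src, dst, n):
    """Keep the first n entries of every array in src and save them to dst."""
    with np.load(src) as data:
        # Assumption: axis 0 is the sample axis for every stored array.
        trimmed = {name: data[name][:n] for name in data.files}
    np.savez(dst, **trimmed)
    return trimmed
```

Writing the output directly to `data/all_data_sorted.npz` would avoid the separate rename step, at the cost of overwriting the existing file.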

Status files (images) necessary to run the H-ID user study also had to be removed.

## EEG privacy notice and impact on code

All code is set up to run with either EEG_Raw or EEG_Net. We cannot share EEG_Raw because of privacy concerns, so in this code all `EEG_Raw = EEG_Net`.
To be clear, every time an experiment with EEG_Raw is run, the EEG data actually used is EEG_Net.
For example, see that in `tools/dataset_tools`, `eeg_raw = eeg_net` and all `eeg_raw` data have been removed from `data/all_data_sorted.npz`.
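
The effect of this aliasing can be illustrated with a small sketch. The key name `"eeg_net"`, the `kind` strings, and the function `select_eeg` are assumptions made for illustration; the actual code in `tools/dataset_tools` may be organized differently.

```python
def select_eeg(data, kind):
    """Return the EEG features for the requested condition."""
    eeg_net = data["eeg_net"]
    # Raw EEG is not distributed, so the raw condition reuses the embeddings.
    eeg_raw = eeg_net
    if kind == "EEG_Net":
        return eeg_net
    if kind == "EEG_Raw":
        return eeg_raw
    raise ValueError(f"unknown EEG condition: {kind!r}")
```

Under this scheme both conditions return the same object, so results labelled EEG_Raw in this release are expected to match the EEG_Net results rather than the raw-EEG numbers in the paper.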

## Setup

This project has been initially set up with `pip` and `rye`, but only tested with `rye` for the final version.
For pip, run `pip install -r requirements_pip.txt` to install all dependencies.
For rye, use `rye sync`, which will use `pyproject.toml` to install all dependencies.

## LICENCE

GPLv3
