# From Faults to Features: Utilizing Sensor Faults to Learn Robust Representations

This repository contains the code to reproduce the results presented in the paper "From Faults to Features: Utilizing Sensor Faults to Learn Robust Representations".

## Project Structure

```
FaultsToFeatures/
├── configs/                  # Configuration files for pretraining and finetuning
├── data_preprocessing/       # Script for reproducible preprocessing of the REVS datasets
├── datasets_raw/             # Folder for raw CSV data files downloaded via the provided links
│   └── REVS Program Vehicle Dynamics Database/
│       ├── 2013_Montery_Motorsports_Reunion/
│       ├── 2013_Montery_Motorsports_Reunion_Test/
│       └── 2013_Targa_Sixty_Six/
├── experiments/              # Scripts that reproduce the presented results in the paper and appendix
├── models/                   # Folder for trained model checkpoints
├── src/                      # Source code for pretraining and finetuning, plus utilities for preprocessing, reproducibility, testing, and visualization
├── main.py                   # Entrypoint for pretraining and finetuning models
├── *.job                     # Several Slurm job files to repeat experiments presented in the paper (e.g., reproduce_bm.job, run_experiments.job)
└── README.md
```

## Data Acquisition

The experiments utilize the REVS Program Vehicle Dynamics Database.

*   **2013 Targa Sixty-Six:** Kegelman, John C. and Harbott, Lene K. and Gerdes, J. Christian. (2016). Stanford Digital Repository. Available at: [http://purl.stanford.edu/yf219gg2055](http://purl.stanford.edu/yf219gg2055)
*   **2013 Monterey Motorsports Reunion:** Kegelman, John C. and Harbott, Lene K. and Gerdes, J. Christian. (2016). Stanford Digital Repository. Available at: [http://purl.stanford.edu/tt103jr6546](http://purl.stanford.edu/tt103jr6546)

## Setup and Installation

1.  **Prerequisites:**
    *   Python 3.11.7
    *   `pip` for package management
    *   A virtual environment tool such as the built-in `venv` module or `virtualenv` (recommended)

2.  **Create Virtual Environment and Install Dependencies:**
    Navigate to the root of the repository in your terminal:
    ```bash
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
    ```

3.  **Download and Organize Data:**
    Create the following directory structure within the `datasets_raw` folder and download the respective CSV files from the links above:
    *   `datasets_raw/REVS Program Vehicle Dynamics Database/2013_Montery_Motorsports_Reunion/`: Download all CSV files *except* `20130811_02_01_01_grandsport.csv`.
    *   `datasets_raw/REVS Program Vehicle Dynamics Database/2013_Montery_Motorsports_Reunion_Test/`: Download only `20130811_02_01_01_grandsport.csv`, which serves as the held-out test file.
    *   `datasets_raw/REVS Program Vehicle Dynamics Database/2013_Targa_Sixty_Six/`: Download all CSV files.

4.  **Run Preprocessing:**
    Execute the preprocessing script once from the root of the repository:
    `python data_preprocessing/preprocess_revs.py`

5.  **Configure Device:**
    *   Ensure the correct computation device (e.g., `cuda`, `mps`, `cpu`) is set in the configuration files located in the `configs/` directory.
    *   Verify the device settings in the experiment files located in the `experiments/` folder.
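
Before running the preprocessing, it can help to confirm that the raw data ended up in the expected folders. The following is a minimal sketch rather than part of the repository: it assumes the raw data root is `datasets_raw/` as shown in the project structure, and the names `check_layout`, `RAW_ROOT`, and `EXPECTED_DIRS` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

# Assumption: the raw data root matches the project structure shown above.
RAW_ROOT = Path("datasets_raw") / "REVS Program Vehicle Dynamics Database"
TEST_FILE = "20130811_02_01_01_grandsport.csv"
EXPECTED_DIRS = (
    "2013_Montery_Motorsports_Reunion",
    "2013_Montery_Motorsports_Reunion_Test",
    "2013_Targa_Sixty_Six",
)


def check_layout(root: Path = RAW_ROOT) -> list[str]:
    """Return a list of problems; an empty list means the layout looks right."""
    problems = []
    for name in EXPECTED_DIRS:
        folder = root / name
        if not folder.is_dir():
            problems.append(f"missing folder: {folder}")
        elif not any(folder.glob("*.csv")):
            problems.append(f"no CSV files in: {folder}")
    # The held-out file belongs only in the *_Test folder.
    if (root / EXPECTED_DIRS[0] / TEST_FILE).exists():
        problems.append(f"{TEST_FILE} should not be in {EXPECTED_DIRS[0]}")
    if not (root / EXPECTED_DIRS[1] / TEST_FILE).is_file():
        problems.append(f"{TEST_FILE} missing from {EXPECTED_DIRS[1]}")
    return problems
```

Calling `check_layout()` from the repository root and printing the returned list gives a quick overview of anything that still needs to be downloaded or moved.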

## Reproducing Results with Slurm

*Note: Adjust the virtual environment activation command in the Slurm job files (`.job` files) to fit your cluster setup. Alternatively, all scripts can be executed manually.*

1.  **Pretrain and Finetune Models:**
    Submit the Slurm jobs to pretrain and finetune the models.
    *   `sbatch reproduce_bm.job` - Trains the benchmark model with 10 seeds.
    *   `sbatch reproduce_m_b_n.job` - Trains our approach with 10 seeds.
    *   `sbatch reproduce_m.job` - Trains traditional masking with 10 seeds.
    *   `sbatch reproduce_bias_scaling.job` - Pretrains our approach with different bias bounds.
    *   `sbatch reproduce_dmodel_scaling.job` - Pretrains our approach with different model sizes.
    *   `sbatch reproduce_noise_scaling.job` - Pretrains our approach with different noise masking strengths.
    *   `sbatch reproduce_other_masks.job` - Pretrains and finetunes other combinations of sensor failure masks.

2.  **Run Experiments:**
    Execute the main experiments Slurm job to reproduce all paper results (excluding closed-loop evaluations):
    `sbatch run_experiments.job`
    *Note: In the experiment scripts found in the `experiments/` folder, there is a boolean variable `EVAL_NEW` which defaults to `False`. Set this to `True` if you wish to re-run the evaluation steps.*
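
The two steps above can also be chained so that the experiments job starts only after every training job has finished successfully. The sketch below is one possible way to do this and is not part of the repository. It assumes that `run_experiments.job` needs all training jobs to complete first, that it is run from the repository root where the `.job` files live, and that `sbatch` is on the `PATH`. It relies on the standard Slurm options `--parsable` and `--dependency=afterok:...`; the function names `sbatch` and `submit_all` are hypothetical.

```python
import subprocess
from typing import Callable

# Job files listed in this README.
TRAINING_JOBS = [
    "reproduce_bm.job",
    "reproduce_m_b_n.job",
    "reproduce_m.job",
    "reproduce_bias_scaling.job",
    "reproduce_dmodel_scaling.job",
    "reproduce_noise_scaling.job",
    "reproduce_other_masks.job",
]
EXPERIMENT_JOB = "run_experiments.job"


def sbatch(args: list[str]) -> str:
    """Submit via sbatch and return the job ID printed by --parsable."""
    out = subprocess.run(
        ["sbatch", "--parsable", *args],
        check=True, capture_output=True, text=True,
    ).stdout
    # --parsable prints "jobid" or "jobid;cluster".
    return out.strip().split(";")[0]


def submit_all(submit: Callable[[list[str]], str] = sbatch) -> str:
    """Submit all training jobs, then the experiments job depending on them."""
    job_ids = [submit([job]) for job in TRAINING_JOBS]
    dependency = "--dependency=afterok:" + ":".join(job_ids)
    return submit([dependency, EXPERIMENT_JOB])
```

Calling `submit_all()` on a login node returns the ID of the experiments job. The `submit` parameter is injectable so the submission order can be inspected without a Slurm cluster.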


## Running Training Scripts Manually

Run all manual training commands from the root of the repository. The main entry point is `main.py`.

1.  **Pretraining:**
    *   Ensure a corresponding pretraining configuration file exists in `configs/` (e.g., `configs/YOUR_PT_CONFIG.yaml`).
    *   Run:
        ```bash
        python main.py --config_name configs/YOUR_PT_CONFIG.yaml --mode pretrain
        ```

2.  **Finetuning:**
    *   Ensure the corresponding finetuning and pretraining configuration files exist in `configs/`.
    *   Run:
        ```bash
        python main.py --config_name configs/YOUR_FT_CONFIG.yaml --mode finetune
        ```

3.  **Benchmark (Supervised Training without Pretraining):**
    *   Ensure that a pretraining config (which defines the encoder architecture) and a finetuning config exist in `configs/`.
    *   In the finetuning config, set `pretrained_model_path` to `null`.
    *   Run:
        ```bash
        python main.py --config_name configs/YOUR_BM_CONFIG.yaml --mode finetune
        ```
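
The manual commands above can be wrapped in a small driver that runs pretraining and finetuning back to back. This is a hedged sketch, not part of the repository: it uses only the `--config_name` and `--mode` flags shown above, treats `pretrain` and `finetune` as the only modes because those are the two documented here, and the helper names `build_command` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Assumption: only the two modes documented in this README are accepted.
VALID_MODES = ("pretrain", "finetune")


def build_command(config_path: str, mode: str) -> list[str]:
    """Build the main.py invocation for one config file and mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # sys.executable keeps the call inside the active virtual environment.
    return [sys.executable, "main.py", "--config_name", config_path, "--mode", mode]


def run_pipeline(pt_config: str, ft_config: str) -> None:
    """Pretrain, then finetune; stop at the first failing step."""
    for config, mode in ((pt_config, "pretrain"), (ft_config, "finetune")):
        subprocess.run(build_command(config, mode), check=True)
```

For example, `run_pipeline("configs/YOUR_PT_CONFIG.yaml", "configs/YOUR_FT_CONFIG.yaml")` would run both stages from the repository root. For the benchmark setting, only the finetune command is needed.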