# SpeeCheck: Self-Contained Integrity Verification via Embedded Acoustic Fingerprints

This repository contains the implementation of **SpeeCheck**, a system for proactive speech integrity verification using embedded acoustic fingerprints.

![system design](./figs/system_design.png)

## Requirements

To install the dependencies:

```bash
pip install -r requirements.txt
```

## Datasets
Download the datasets:
* [VoxCeleb](https://mm.kaist.ac.kr/datasets/voxceleb/)
* [LibriSpeech](https://www.openslr.org/12)

Note: All speech data must be resampled to 16 kHz mono. For LibriSpeech, you can convert the dataset with the provided script:

```bash
bash ./convert_libri.sh
```
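
To confirm that a directory already meets the 16 kHz mono requirement, a quick check along the following lines may help. This is a minimal sketch using only Python's standard `wave` module, so it only reads uncompressed PCM WAV files. The helper names `is_16k_mono` and `find_nonconforming` are illustrative and are not part of this repository.

```python
import wave
from pathlib import Path

TARGET_RATE = 16000   # Hz, the sample rate required above
TARGET_CHANNELS = 1   # mono


def is_16k_mono(path):
    """Return True if the PCM WAV file at `path` is 16 kHz mono."""
    with wave.open(str(path), "rb") as wav:
        return (wav.getframerate() == TARGET_RATE
                and wav.getnchannels() == TARGET_CHANNELS)


def find_nonconforming(data_dir):
    """List WAV files under `data_dir` that still need resampling or downmixing."""
    return sorted(p for p in Path(data_dir).rglob("*.wav") if not is_16k_mono(p))
```

For example, `find_nonconforming("../vox_train")` returns the paths that still have to be converted. An empty list means every WAV file found is already 16 kHz mono.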

## Preprocessing
Before training, preprocess the dataset:
```bash
python preprocessing.py \
    --data_dir [DATA_DIR] \
    --save_dir [OUTPUT_DIR] \
    --num_gpus [NUM_GPUS] \
    --num_samples [NUM_SAMPLES]
```

Example:
```bash
python preprocessing.py \
    --data_dir ../vox_train \
    --save_dir ../preprocessed_train \
    --num_gpus 4 \
    --num_samples 5000
```

## Training
Train the SpeeCheck model with:
```bash
python fingerprint_training.py \
    --preprocessed_data_dir ../preprocessed_train \
    --checkpoint_dir ./chkpt \
    --temperature 0.05 \
    --batch_size 64
```

## Protecting Speech
After training, SpeeCheck can embed a watermark into speech to protect its integrity. The original speech is read from `original_wav/` (`--test_dir`), and the protected version is saved to `protected_wav/` (`--save_dir`).
```bash
python SpeeCheck.py \
    --save_dir ./protected_wav \
    --test_dir ./original_wav \
    --checkpoint ./chkpt-t0.05/SpeeCheck_best.pth
```

## Integrity Verification
To evaluate the integrity verification performance on protected speech, run the notebook:
```bash
jupyter notebook SpeeCheck_testing.ipynb
```

## End-to-End Real-World Evaluation

Before running integrity verification, the speech must first be **protected** with SpeeCheck.
To do so, process the original speech in `realworld_speech/original/` with:

```bash
python SpeeCheck.py \
    --save_dir ./realworld_speech/protected \
    --test_dir ./realworld_speech/original \
    --checkpoint ./chkpt-t0.05/SpeeCheck_best.pth
```
The protected speech (watermarked version) can then be:
* Shared on social media platforms (e.g., YouTube, TikTok, Facebook, Weibo, WhatsApp), or
* Edited with benign/malicious operations (e.g., compression, resampling, splicing, substitution).
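
To make the splicing and substitution edits concrete, the sketch below applies them to a waveform held as a plain Python list of samples. It is only an illustration of what such tampering does to the signal, not the editing procedure used in the evaluation. The function names `delete_segment` and `substitute_segment` are hypothetical.

```python
def delete_segment(samples, start, end):
    """Splicing by deletion: drop samples[start:end] and join the remainder."""
    return samples[:start] + samples[end:]


def substitute_segment(samples, start, replacement):
    """Substitution: overwrite a span with `replacement`, keeping the total length."""
    end = start + len(replacement)
    if start < 0 or end > len(samples):
        raise ValueError("replacement does not fit inside the signal")
    return samples[:start] + list(replacement) + samples[end:]


signal = list(range(10))                            # stand-in for real audio samples
spliced = delete_segment(signal, 2, 5)              # shorter than the original
replaced = substitute_segment(signal, 2, [0, 0, 0]) # same length, altered content
```

Deletion shortens the signal, while substitution preserves its length and changes only the content.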

Finally, to verify the integrity of the processed speech:
```bash
python Speech_testing.py \
    --test_dir ./realworld_speech \
    --checkpoint ./chkpt-t0.05/SpeeCheck_best.pth \
    --threshold 42 \
    --sample_rate 16000
```

👉 For a real-world evaluation demo, see the [demo webpage](https://speecheck.github.io/SpeeCheck/).
