Title: Zero-Shot LibriSpeech Clean ASR (Speaker-Disjoint)

Problem statement
Build an automatic speech recognition (ASR) model that transcribes short English utterances into text. You are given paired audio-text training data and unlabeled test audio. Speakers in the test set do not appear in training, so your model must generalize to unseen speakers (zero-shot by speaker).

Data
- train.csv: rows of training utterances
  - id: unique identifier
  - audio_path: relative path to the FLAC file within the package
  - transcript: normalized transcription of the audio
- test.csv: rows of test utterances (labels withheld)
  - id: unique identifier
  - audio_path: relative path to the FLAC file within the package
- audio_train/: FLAC audio files for train.csv
- audio_test/: FLAC audio files for test.csv
- sample_submission.csv: a correctly formatted submission template; its transcripts are random placeholders, not real predictions
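
As a starting point, the CSV manifests described above could be read with a few lines of standard-library Python. This is a minimal sketch, not part of the official package: the helper name load_manifest and the added audio_file key are hypothetical, and it assumes each audio_path is relative to the directory that contains the CSV files.

```python
import csv
from pathlib import Path


def load_manifest(csv_path, root="."):
    """Read train.csv or test.csv and resolve each audio_path against root.

    Returns (rows, missing): rows is a list of dicts with the CSV columns
    plus a hypothetical "audio_file" key holding the resolved Path, and
    missing lists the ids whose audio file was not found on disk.
    """
    root = Path(root)
    rows, missing = [], []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Assumption: audio_path is relative to the package root.
            row["audio_file"] = root / row["audio_path"]
            if not row["audio_file"].is_file():
                missing.append(row["id"])
            rows.append(row)
    return rows, missing
```

Calling load_manifest("train.csv") from the package root should return one row per utterance and an empty missing list if the package unpacked correctly. Decoding the FLAC files themselves needs an audio library of your choice and is not shown here.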

Notes
- Audio is sourced from LibriSpeech clean subsets. Files and file names have been anonymized to prevent label leakage.
- Speakers are split disjointly between train and test. Expect test speakers that were never seen in training.
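- Transcripts are UPPERCASE in the original corpus. You may normalize text however you like for training; the evaluation lowercases text before scoring, so letter case does not affect WER.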

Submission format
- CSV file with exactly two columns in any order: id, transcript
- One row per test id
- The id set must match test.csv exactly
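
The three requirements above can be checked locally before uploading. The following is a minimal sketch under stated assumptions: check_submission is a hypothetical helper, not an official validator, and the real grader may apply additional checks.

```python
import csv


def check_submission(submission_file, test_file):
    """Return a list of problems found; an empty list means the checks passed.

    Both arguments are open text streams (for example from
    open(path, newline="")). This mirrors the stated rules only.
    """
    sub = csv.DictReader(submission_file)
    columns = sub.fieldnames or []
    # Exactly two columns, in any order.
    if sorted(columns) != ["id", "transcript"]:
        return [f"expected columns id and transcript, got {columns}"]

    sub_ids = [row["id"] for row in sub]
    test_ids = {row["id"] for row in csv.DictReader(test_file)}

    problems = []
    # One row per test id: no id may appear twice.
    if len(sub_ids) != len(set(sub_ids)):
        problems.append("duplicate ids in submission")
    # The id set must match test.csv exactly.
    missing = test_ids - set(sub_ids)
    extra = set(sub_ids) - test_ids
    if missing or extra:
        problems.append(f"{len(missing)} ids missing, {len(extra)} unexpected")
    return problems
```

A typical call opens both files with open(path, newline="", encoding="utf-8") and passes the two streams in; the file name of your own submission is up to you.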

Evaluation
- Metric: corpus-level Word Error Rate (WER)
  - We compute WER over normalized text: lowercase, hyphens to spaces, remove non-alphanumeric characters except apostrophes, and collapse whitespace.
  - Corpus WER is computed by summing substitutions, deletions, and insertions over all test utterances and dividing by the total number of reference words.
  - Lower is better (0.0 is perfect).
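
For local validation, the metric described above can be approximated as follows. This is a sketch, not the official scorer: the function names are hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits, that removed characters are deleted rather than replaced by spaces, and that the normalization steps run in the order listed. The sum of substitutions, deletions, and insertions is taken as the word-level Levenshtein distance.

```python
import re


def normalize(text):
    """Lowercase, hyphens to spaces, drop other punctuation, collapse spaces."""
    text = text.lower().replace("-", " ")
    # Assumption: keep ASCII letters, digits, apostrophes, and whitespace.
    text = re.sub(r"[^a-z0-9'\s]", "", text)
    return " ".join(text.split())


def edit_distance(ref, hyp):
    """Minimum substitutions + deletions + insertions between word lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # match or substitution
        prev = cur
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Total errors over all utterances divided by total reference words."""
    errors = words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = normalize(ref).split()
        hyp_words = normalize(hyp).split()
        errors += edit_distance(ref_words, hyp_words)
        words += len(ref_words)
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words
```

Because errors are pooled before dividing, long utterances weigh more than short ones; this differs from averaging per-utterance WER. For example, corpus_wer(["HELLO WORLD", "THE CAT SAT"], ["hello word", "the cat sat down"]) counts one substitution and one insertion against five reference words.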

Rules and suggestions
- No metadata about speaker identities is provided. Do not attempt to infer speaker identities from the original file names.

Files to use
- Use train.csv, test.csv, audio_train/, audio_test/, and sample_submission.csv for your solution.

Good luck and happy modeling!
