# Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

Contributions of our paper:
- We extend the latest ASR generative error correction benchmark to noise-robust ASR and propose a new Robust HyPoradise (RobustHP) dataset;
- We propose a noise-aware generative error correction (RobustGER) approach that teaches LLMs to perform language-space denoising, achieving up to 53.9% relative WER reduction on the CHiME-4 subset of RobustHP.
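
For readers unfamiliar with the metric, the sketch below shows how a relative WER reduction is commonly computed: word error rate is the word-level edit distance divided by the number of reference words, and the relative reduction compares a baseline WER with the WER after correction. The function names (`word_errors`, `wer`, `relative_wer_reduction`) and the example sentences are hypothetical and are not taken from the released code; the evaluation in this repository may normalize text differently.

```python
from typing import Sequence


def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level Levenshtein distance (substitutions + deletions + insertions)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from reference prefix to empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level WER: total word errors over total reference words."""
    errors = sum(word_errors(r, h) for r, h in zip(references, hypotheses))
    words = sum(len(r.split()) for r in references)
    return errors / words


def relative_wer_reduction(baseline_wer: float, corrected_wer: float) -> float:
    """Fraction of the baseline WER removed by error correction."""
    return (baseline_wer - corrected_wer) / baseline_wer


if __name__ == "__main__":
    # Illustrative sentences only; these are not results from the paper.
    refs = ["the cat sat on the mat"]
    baseline = ["the cat sit on mat"]        # e.g. a 1-best ASR hypothesis
    corrected = ["the cat sat on mat"]       # e.g. output after correction
    base, corr = wer(refs, baseline), wer(refs, corrected)
    print(f"baseline WER={base:.3f}, corrected WER={corr:.3f}, "
          f"relative reduction={relative_wer_reduction(base, corr):.1%}")
```

A relative reduction is a ratio of error rates, so the same relative figure can correspond to very different absolute WER changes depending on the baseline.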

This supplementary material provides:
- Model code (`lit_gpt/robust_ger.py`);
- Inference script (`infer.sh`);
- RobustHP test sets (`hypo_paradise/`).

Due to the **100 MB size limit** on supplementary material, the trained model weights are not included here; we will release them with the final version.

All resources for this paper, including the training scripts, the entire RobustHP dataset, and all trained models, will be open-sourced upon publication to support the community.
