Abstract: We describe a method for continuously improving the accuracy of a large-scale medical automatic speech recognition (ASR) system using a multi-step cycle involving several groups of workers. We address the unique challenges of the medical domain and discuss how automatically created and crowdsourced input data are combined to refine the ASR language models. The improvement cycle helped reduce the original system's word error rate from 34.1% to 10.4%, which approaches the accuracy of human transcribers trained in medical transcription.
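
For readers unfamiliar with the metric, the sketch below shows how word error rate (WER) is conventionally computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer's output, divided by the number of reference words. The abstract does not state the exact scoring procedure used, so this is only the standard definition; the function names and the sample sentences are hypothetical and chosen purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length.

    Assumption: whitespace tokenization with no text normalization;
    a real scoring pipeline would likely normalize case, punctuation, etc.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped, relative to its starting value."""
    return (before - after) / before


# Hypothetical example: one substituted word out of four.
wer = word_error_rate("patient denies chest pain", "patient denies chess pain")

# The figures reported in the abstract: 34.1% down to 10.4%.
reduction = relative_reduction(34.1, 10.4)
```

Applied to the reported figures, the drop from 34.1% to 10.4% is 23.7 percentage points in absolute terms, or roughly a 69.5% relative reduction in word errors.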