Term clouds as surrogates for user generated speech

Manos Tsagkias, Martha Larson, Maarten de Rijke

2008 (modified: 11 Nov 2022), SIGIR 2008

Abstract: User-generated spoken audio remains a challenge for Automatic Speech Recognition (ASR) technology, so content-based audio surrogates derived from ASR transcripts must be robust to recognition errors. An investigation of term clouds as surrogates for podcasts demonstrates that term clouds derived from ASR transcripts closely approximate term clouds derived from human-generated transcripts across a range of cloud sizes. A user study confirms that ASR clouds are viable surrogates for depicting the content of podcasts.
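As a rough illustration of what comparing an ASR cloud with a human-transcript cloud at a given size could look like, the sketch below builds each cloud from raw term frequency and reports the fraction of reference-cloud terms that also appear in the ASR cloud. This is an assumption-laden toy, not the procedure used in the paper: the function names `term_cloud` and `cloud_overlap`, the tokenization, the frequency-only term selection, and the overlap measure are all hypothetical choices made for the example.

```python
import re
from collections import Counter


def term_cloud(text, size):
    """Return the `size` most frequent terms in `text` as a set.

    Hypothetical simplification: terms are lowercase alphabetic tokens
    ranked by raw frequency, with no stopword removal or weighting.
    """
    terms = re.findall(r"[a-z']+", text.lower())
    return {term for term, _ in Counter(terms).most_common(size)}


def cloud_overlap(asr_text, reference_text, size):
    """Fraction of reference-cloud terms that also occur in the ASR cloud."""
    asr_cloud = term_cloud(asr_text, size)
    reference_cloud = term_cloud(reference_text, size)
    if not reference_cloud:
        return 0.0
    return len(asr_cloud & reference_cloud) / len(reference_cloud)


if __name__ == "__main__":
    # Toy transcripts (invented for illustration, not data from the study).
    reference = "cat cat cat dog dog bird"
    asr = "cat cat cat dog dog word"  # one simulated recognition error
    for size in (2, 3):
        print(size, cloud_overlap(asr, reference, size))
```

Running the overlap at several values of `size` mirrors the idea of checking agreement across a range of cloud sizes; a recognition error only lowers the score once the cloud is large enough to include the affected term.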
