NAP for high level language identification

Fred S. Richardson, William M. Campbell

Published: 2011, Last Modified: 20 May 2025ICASSP 2011EveryoneRevisionsBibTeXCC BY-SA 4.0

Abstract: Varying channel conditions present a difficult problem for many speech technologies such as language identification (LID). Channel compensation techniques have been shown to significantly improve performance in LID for acoustic systems. For high-level token systems, nuisance attribute projection (NAP) has been shown to per form well in the context of speaker identification. In this work, we describe a novel approach to dealing with the high dimensional sparse NAP training problem as applied to a 4-gram phonotactic LID system run on the NIST 2009 Language Recognition Evaluation (LRE) task. We demonstrate performance gains on the Voice of America (VOA) portion of the 2009 LRE data.