Reducing the Cost of Spoof Detection Labeling using Mixed-Strategy Active Learning and Pretrained Models
Abstract: Active learning reduces the amount of labeled training data that a machine learning model needs to learn a task, without degrading performance. It does so by iteratively applying a sampling strategy to select the most informative samples from an unlabeled dataset, which are then labeled by an oracle (i.e., a human annotator). In recent years, pretrained models have been used as frontends for neural networks trained with active learning to increase efficiency. This work applies active learning with pretrained models to the spoof detection task with two goals: 1) identifying which pretrained speech models and active learning strategies are most effective for spoof detection, and 2) developing an active learning method that, at each step of the active learning process, selects the optimal sampling strategy from a list of available strategies. This mixed strategy is shown to outperform all individual strategies for the task.
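The loop described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the mixed strategy chooses among sampling strategies, so everything below is an assumption made for illustration: the selection rule (try each strategy once, then reuse the one whose last batch gave the largest validation-accuracy gain), the nearest-centroid classifier standing in for a network on top of pretrained speech embeddings, and all function names. It is not the authors' method.

```python
import numpy as np

def fit_centroids(X, y, n_classes=2):
    # Stand-in classifier (assumption): one mean vector per class.
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_proba(centroids, X):
    # Softmax over negative squared distances to each class centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

# Standard uncertainty scores: higher means "more informative".
def least_confidence(p):
    return 1.0 - p.max(axis=1)

def margin(p):
    s = np.sort(p, axis=1)
    return -(s[:, -1] - s[:, -2])

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=1)

STRATEGIES = {
    "least_confidence": least_confidence,
    "margin": margin,
    "entropy": entropy,
}

def accuracy(model, X, y):
    return float((predict_proba(model, X).argmax(axis=1) == y).mean())

def mixed_strategy_active_learning(X_pool, y_pool, X_val, y_val,
                                   seed_idx, steps, batch):
    """Pool-based active learning with a hypothetical strategy selector."""
    labeled = [int(i) for i in seed_idx]
    seen = set(labeled)
    unlabeled = [i for i in range(len(X_pool)) if i not in seen]
    # Untried strategies start at +inf so each one is tried once first.
    gains = {name: np.inf for name in STRATEGIES}
    model = fit_centroids(X_pool[labeled], y_pool[labeled])
    acc = accuracy(model, X_val, y_val)
    history = []
    for _ in range(steps):
        name = max(gains, key=gains.get)           # pick current best strategy
        scores = STRATEGIES[name](predict_proba(model, X_pool[unlabeled]))
        top = np.argsort(scores)[-batch:]          # most informative samples
        picked = [unlabeled[i] for i in top]
        labeled.extend(picked)                     # oracle reveals y_pool[picked]
        picked_set = set(picked)
        unlabeled = [i for i in unlabeled if i not in picked_set]
        model = fit_centroids(X_pool[labeled], y_pool[labeled])
        new_acc = accuracy(model, X_val, y_val)
        gains[name] = new_acc - acc                # reward for this strategy
        acc = new_acc
        history.append((name, acc))
    return labeled, history
```

The seed set passed as `seed_idx` must contain at least one sample of each class, since the stand-in classifier needs a centroid per class. The returned `history` records which strategy was used at each step and the resulting validation accuracy, which is the kind of per-step trace one would inspect when comparing a mixed strategy against individual ones.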