A Single-Channel Drone Noise Reduction Algorithm Based on Speech Harmonic Features

Published: 2025, Last Modified: 09 Nov 2025 | ICIC (9) 2025 | CC BY-SA 4.0
Abstract: Speech enhancement on drone platforms poses significant challenges due to the distinctive noise characteristics inherent to these environments. Traditional speech enhancement models often rely on predefined noise databases to learn noise patterns, simplifying deep neural networks by focusing primarily on amplitude information. This approach neglects the critical roles of phase information, temporal dependencies, and contextual relevance. To address speech enhancement under low signal-to-noise ratio (SNR) conditions, this paper proposes a drone noise reduction algorithm leveraging speech harmonic features. The proposed algorithm employs an encoder-decoder architecture to extract speech features and integrates a dual-path Transformer to capture short-term correlations within speech signals. To improve the accuracy of speech magnitude mask estimation, a gating mechanism that combines harmonic position detection, Voice Activity Detection (VAD), and Voice Region Detection (VRD) is introduced to refine the network's estimated speech magnitude spectrum. Additionally, the algorithm utilizes dual-path multiple refinement iterations with a harmonic attention mechanism to fully exploit the inherent physical structure of speech. Experimental results demonstrate that the proposed method outperforms all compared time-frequency (TF) domain approaches in terms of speech quality metrics.
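The abstract gives no formulas for the gating mechanism, so the following is only an illustrative sketch of the general idea of gating an estimated magnitude mask with harmonic positions and a VAD decision. Everything in it is an assumption and not the paper's method: the function names `harmonic_gate` and `refine_mask`, the multiplicative form of the gate, the `bandwidth_hz` and `floor` parameters, and the availability of a per-frame pitch estimate `f0_hz` and VAD value `vad`. VRD is omitted, and in the proposed algorithm these cues are presumably produced inside the network rather than supplied externally.

```python
import numpy as np

def harmonic_gate(f0_hz, n_bins, sample_rate, bandwidth_hz=40.0, floor=0.1):
    """Build an (n_bins, n_frames) gate: 1 near multiples of f0, `floor` elsewhere.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    # Centre frequency of each STFT bin, from 0 Hz up to Nyquist.
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    gate = np.full((n_bins, f0_hz.size), floor)
    for t, f0 in enumerate(f0_hz):
        if f0 <= 0:
            # No pitch estimate (e.g. unvoiced frame): pass all bins through
            # and leave the decision to the VAD term.
            gate[:, t] = 1.0
            continue
        # Index of the nearest harmonic k * f0 for every bin (k >= 1).
        k = np.maximum(np.round(freqs / f0), 1.0)
        dist = np.abs(freqs - k * f0)
        gate[dist <= bandwidth_hz / 2.0, t] = 1.0
    return gate

def refine_mask(mask, vad, f0_hz, sample_rate, floor=0.1):
    """Refine a magnitude mask (n_bins, n_frames) with harmonic and VAD gates."""
    mask = np.asarray(mask, dtype=float)
    vad = np.asarray(vad, dtype=float)  # per-frame speech presence in [0, 1]
    gate = harmonic_gate(f0_hz, mask.shape[0], sample_rate, floor=floor)
    # Broadcast the frame-level VAD across all frequency bins.
    return mask * gate * vad[np.newaxis, :]

# Toy usage: 257 bins at 16 kHz, three frames.
mask = np.ones((257, 3))
refined = refine_mask(mask, vad=[1.0, 0.0, 1.0], f0_hz=[250.0, 250.0, 0.0],
                      sample_rate=16000)
```

In this toy setup, bins between harmonics are attenuated to `floor` in voiced frames, frames the VAD marks as non-speech are zeroed, and frames without a pitch estimate keep the original mask. A hard 0/1 comb is the simplest possible choice; a learned or smoothly weighted gate would be a more realistic stand-in for what the abstract describes.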