Probabilistic binaural multiple sources localization based on time-delay compensation estimator and clustering analysis

Published: 2016, Last Modified: 16 May 2025, IROS 2016, CC BY-SA 4.0
Abstract: Sound source localization (SSL) is an essential technique in many applications, such as robot audition, human-robot interaction, and speech capturing. However, SSL from a binaural input remains a challenging problem, particularly when multiple sources are active simultaneously. In this work, we propose a multi-source localization framework based on the time-delay compensation (TDC) estimator and clustering analysis. The TDC estimator estimates the binaural cues simultaneously, which removes the limitation of extracting each cue with an independent processor. The multi-source decision is made by clustering the binaural cues obtained from multiple signal frames. In experiments, we demonstrate that localization performance improves over methods that assume the number of spatially stationary sources is known. Results with both simulated and recorded impulse responses show that robust performance can be achieved with limited prior training, and that our method also adapts to different sound activities.
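The pipeline outlined in the abstract (per-frame binaural cue estimation, then clustering of the cues over many frames) can be illustrated with a minimal sketch. This is not the paper's TDC estimator. It swaps in a plain cross-correlation estimate of the interaural time delay and a simple gap-based one-dimensional clustering, and it assumes that each frame is dominated by a single source. All function names, parameters, and the toy signal model are hypothetical and chosen only for illustration.

```python
import random

def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation of two frames."""
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping region only.
        score = sum(left[i] * right[i + lag]
                    for i in range(max(0, -lag), min(n, n - lag)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def cluster_delays(delays, gap=2, min_frames=3):
    """Group 1-D delays: a new cluster starts when sorted neighbours differ by more than `gap`.

    The number of clusters is not given in advance; clusters supported by fewer
    than `min_frames` frames are discarded as outliers.
    """
    clusters = []
    for d in sorted(delays):
        if clusters and d - clusters[-1][-1] <= gap:
            clusters[-1].append(d)
        else:
            clusters.append([d])
    return [sum(c) / len(c) for c in clusters if len(c) >= min_frames]

def make_frame(delay, n=256, rng=random):
    """Toy frame (assumption): white noise at the left ear, a delayed copy at the right ear."""
    off = abs(delay)
    src = [rng.gauss(0, 1) for _ in range(n + 2 * off)]
    left = src[off:off + n]
    right = src[off - delay:off - delay + n]
    return left, right

# Two hypothetical sources with delays of -6 and +4 samples, ten frames each.
rng = random.Random(0)
true_delays = [-6] * 10 + [4] * 10
delays = [estimate_delay(*make_frame(d, rng=rng), max_lag=10) for d in true_delays]
centers = cluster_delays(delays)
print(len(centers), "source(s) detected at delays", centers)
```

Each cluster center stands for one detected source, so the source count comes out of the clustering step rather than being supplied beforehand. Mapping a delay to an azimuth would require a head model or trained templates, which this sketch leaves out.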