Abstract: The ongoing industrial revolution brings an increasing need for human-machine communication in all areas of everyday life. Efficient and flawless human-machine communication becomes a major challenge in crowded environments. The communication scheme used in present robotic and humanoid systems usually neither assumes simultaneous interaction with multiple interlocutors (e.g. a robot and two talking persons) nor includes the associated mechanisms for distinguishing between speakers; instead, interaction is simplified to a non-overlapping, sequential exchange of questions and answers between two interlocutors. This article outlines a novel framework to improve human-machine communication in the area of humanoid robots. The concept of acoustic-visual beamforming is already known, and the acoustic camera is one implementation of it. However, according to the authors, combining acoustic and visual perception requires significant changes in the approach to the design of robotic systems in order to take full advantage of the multi-talker capability. The perception of an interlocutor is more natural when a human is able to perceive multi-channel information. This motivates the use of acoustic-visual sensors that support beamforming of the audio signal and the assignment of the resulting signals to each interlocutor in the engagement zone of a humanoid robot. This is useful provided it operates in a time regime adequate for a conversation.