Abstract: Industrial intelligent devices are usually equipped with both microphones and cameras to perceive and understand the physical world. Although visual object detection has achieved great success, its combination with other sensing modalities remains largely unsolved. In this article, we establish a novel sound-induced attention framework for visual object detection and develop a two-stream weakly supervised deep learning architecture that combines the visual and audio modalities to localize the sounding object. A dataset is constructed from Audio Set to validate the proposed method, and realistic experiments are conducted to demonstrate the effectiveness of the proposed system.
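The abstract does not specify the architecture's internals, but the core idea of sound-induced attention can be illustrated with a minimal sketch: an audio embedding scores visual region features by similarity, and the resulting attention map both pools the visual stream and serves as a localization signal under weak supervision. All function names, dimensions, and the dot-product scoring below are illustrative assumptions, not the paper's actual model.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sound_induced_attention(visual_feats, audio_feat):
    """Hypothetical sketch of audio-guided attention over visual regions.

    visual_feats: list of R region feature vectors, each of dimension D
                  (e.g. cells of a CNN feature map, flattened).
    audio_feat:   a single audio embedding of dimension D.

    Returns (attention weights over regions, attended visual feature).
    The attention map indicates where the sounding object likely is,
    which is the weak localization signal described in the abstract.
    """
    # Score each visual region by dot-product similarity with the audio.
    scores = [sum(v * a for v, a in zip(region, audio_feat))
              for region in visual_feats]
    attn = softmax(scores)
    # Attention-weighted pooling of the visual features.
    dim = len(audio_feat)
    pooled = [sum(attn[r] * visual_feats[r][d]
                  for r in range(len(visual_feats)))
              for d in range(dim)]
    return attn, pooled
```

In a weakly supervised setting, only clip-level labels would drive training of the pooled feature's classifier, while the attention weights are read out at test time to localize the sounding object.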