Active Object Discovery and Localization Using Sound-Induced Attention

2021 (modified: 04 Nov 2022), IEEE Trans. Ind. Informatics 2021, Readers: Everyone
Abstract: Industrial intelligent devices are usually equipped with both microphones and cameras to perceive and understand the physical world. Although visual object detection has achieved great success, combining it with other sensing modalities remains an open problem. In this article, we establish a novel sound-induced attention framework for visual object detection and develop a two-stream, weakly supervised deep learning architecture that combines the visual and audio modalities to localize the sounding object. A dataset is constructed from Audio Set to validate the proposed method, and realistic experiments are conducted to demonstrate the effectiveness of the proposed system.