Generating visual-adaptive audio representation for audio recognition

Published: 01 Jan 2025, Last Modified: 14 May 2025, Pattern Recognit. Lett. 2025, CC BY-SA 4.0
Abstract: Highlights
• We introduce an audio feature that can be easily used in prior work that relies on spectrograms.
• We propose Visual Adaptive Spectrogram Generation (VASG).
• VASG uses the audiovisual correspondence of unlabeled video data.
• Our model can be applied to spectrogram-based audio tasks without visual inputs.