Abstract: Highlights
• We introduce an audio feature that can be easily employed in previous studies that use spectrograms.
• We propose Visual Adaptive Spectrogram Generation (VASG).
• VASG uses the audiovisual correspondence of unlabeled video data.
• Our model can be applied to spectrogram-based audio tasks without visual inputs.