An Investigation on Data Augmentation and Multiple Instance Learning for Diagnosis of COVID-19 from Speech and Cough Sound

Tomoya Koike; Zhihua Wang; Kun Qian; Bin Hu; Björn W. Schuller; Yoshiharu Yamamoto

Published: 01 Jan 2023, Last Modified: 13 May 2025, Venue: ICCE-Taiwan 2023, License: CC BY-SA 4.0

Abstract: Computer-audition-based approaches for diagnosing COVID-19 can provide a low-cost, convenient, and real-time solution for combating the ongoing global pandemic. In this contribution, we investigate data augmentation and multiple instance learning methods for diagnosing COVID-19 from speech and cough sound data. We first introduce a novel deep convolutional neural network pre-trained on a large-scale audio data set, i.e., AudioSet. Moreover, we use a multiple instance learning paradigm to address the training difficulties caused by the varying length of the audio instances. Experimental results demonstrate the effectiveness of the proposed methods, which reach a best unweighted average recall of 75.9 %, surpassing the best single-model official baseline by 3.0 % and the best fused baseline by 2.0 %.
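
To make the two technical ideas in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It shows one common way to treat a variable-length recording as a "bag" of fixed-length instances and pool per-instance scores into a single bag-level score, together with the standard definition of unweighted average recall (the mean of per-class recalls). The function names, the instance length, the hop size, and the max/mean pooling choices are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Callable, List, Sequence


def split_into_instances(signal: Sequence[float], instance_len: int, hop: int) -> List[List[float]]:
    """Cut a variable-length signal into fixed-length instances (one 'bag').

    The last instance is zero-padded so every instance has the same length.
    instance_len and hop are illustrative parameters, not values from the paper.
    """
    instances: List[List[float]] = []
    start = 0
    while True:
        chunk = list(signal[start:start + instance_len])
        chunk += [0.0] * (instance_len - len(chunk))  # pad the tail
        instances.append(chunk)
        if start + instance_len >= len(signal):
            break
        start += hop
    return instances


def bag_score(instances: List[List[float]],
              score_fn: Callable[[List[float]], float],
              pooling: str = "max") -> float:
    """Pool per-instance scores into one bag-level score.

    score_fn stands in for a trained model; max and mean pooling are
    assumed here as two common multiple instance learning choices.
    """
    scores = [score_fn(x) for x in instances]
    if pooling == "max":
        return max(scores)
    if pooling == "mean":
        return sum(scores) / len(scores)
    raise ValueError(f"unknown pooling: {pooling}")


def unweighted_average_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recalls, so each class counts equally regardless of size."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)
```

Because every recording is reduced to one bag-level score, recordings of different durations can share a single label during training, which is the difficulty the multiple instance learning paradigm is meant to address. Unweighted average recall is preferred over plain accuracy when classes are imbalanced, since a classifier that always predicts the majority class scores only 50 % on a two-class task.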
