Abstract: In this paper, we present M2D: a multimodal deep learning framework for automatic medical condition diagnosis via transfer learning. M2D leverages acoustic and textual features extracted from an audio utterance and its corresponding transcription describing a patient's medical symptoms. Our model uses ResNet-34 to learn audio features from log mel-spectrograms and the BioBERT language model to learn textual features. We conducted a comparative performance analysis of M2D against baseline models that use textual or acoustic features alone.