Adversarial Fine-tuning using Generated Respiratory Sound to Address Class Imbalance

Published: 27 Oct 2023, Last Modified: 10 Nov 2023, DGM4H NeurIPS 2023 Poster
Keywords: synthetic respiratory sound; audio diffusion model; deep generative models; address class imbalance; respiratory sound classification; adversarial fine-tuning
TL;DR: This paper aims to generate high-fidelity respiratory sound samples with a diffusion model, combine the synthetic samples with real data, and address the distribution inconsistency between them to improve respiratory sound classification performance.
Abstract: Deep generative models have emerged as a promising approach to address data scarcity in the medical image domain. However, their use for sequential data such as respiratory sounds is less explored. In this work, we propose a straightforward approach to augment imbalanced respiratory sound data using an audio diffusion model as a conditional neural vocoder. We also demonstrate a simple yet effective adversarial fine-tuning method that aligns features between synthetic and real respiratory sound samples to improve respiratory sound classification performance. Our experimental results on the ICBHI dataset demonstrate that the proposed adversarial fine-tuning is effective, whereas using only the conventional augmentation method degrades performance. Moreover, our method outperforms the baseline by 2.24% on the ICBHI Score and improves the accuracy of the minority classes by up to 26.58%. For the supplementary material, we provide the code at
Submission Number: 41
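
Note on the ICBHI Score: the abstract reports gains on the ICBHI Score without defining it. The sketch below assumes the standard ICBHI 2017 challenge definition, in which specificity is the fraction of normal cycles classified correctly, sensitivity is the fraction of abnormal cycles (crackle, wheeze, or both) assigned to their exact class, and the score is the mean of the two. The function name `icbhi_score` and the label strings are hypothetical and are not taken from the authors' code.

```python
def icbhi_score(y_true, y_pred, normal_label="normal"):
    """Return (sensitivity, specificity, score) for per-cycle labels.

    Assumes the common ICBHI 2017 convention; label names are illustrative.
    """
    normal_total = normal_correct = 0
    abnormal_total = abnormal_correct = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == normal_label:
            normal_total += 1
            normal_correct += int(pred == truth)
        else:
            # An abnormal cycle counts as correct only if its exact class matches.
            abnormal_total += 1
            abnormal_correct += int(pred == truth)
    if normal_total == 0 or abnormal_total == 0:
        raise ValueError("need at least one normal and one abnormal cycle")
    specificity = normal_correct / normal_total
    sensitivity = abnormal_correct / abnormal_total
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Toy labels, invented for illustration only (not results from the paper).
y_true = ["normal", "normal", "normal", "normal",
          "crackle", "wheeze", "both", "crackle"]
y_pred = ["normal", "normal", "normal", "wheeze",
          "crackle", "normal", "both", "wheeze"]
se, sp, score = icbhi_score(y_true, y_pred)
print(f"sensitivity={se:.3f} specificity={sp:.3f} score={score:.3f}")
```

Because the score averages sensitivity and specificity, a classifier that favors the majority normal class can reach high specificity while scoring poorly overall, which is why minority-class accuracy matters for this metric.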