Abstract: Generative Adversarial Networks (GANs) have gained a lot of
attention from the machine learning community due to their ability
to learn and mimic an input data distribution. GANs consist of a
discriminator and a generator that play a min-max game to learn a
target data distribution, with the generator fed data points sampled
from a simpler distribution (such as a uniform or Gaussian
distribution). Once trained, they allow synthetic generation of
examples sampled from the target
distribution. We investigate the application of GANs to generate
synthetic feature vectors used for speech emotion recognition.
Specifically, we investigate two setups: (i) a vanilla GAN that
learns the distribution of a lower dimensional representation of the
actual higher dimensional feature vector and (ii) a conditional GAN
that learns the distribution of the higher dimensional feature
vectors conditioned on the labels, i.e., the emotional classes to
which they belong. As a potential practical application of these
synthetically generated samples, we measure any
improvement in a classifier’s performance when the synthetic data is
used along with real data for training. We perform cross-validation
analyses followed by a cross-corpus study.
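
For reference, the min-max game between generator and discriminator is commonly written in the standard form below, followed by its label-conditioned counterpart. The symbols are notational assumptions not fixed by the abstract (G: generator, D: discriminator, x: a real feature vector, z: a sample from the simpler distribution p_z, y: an emotion label), and the exact objectives used in this work may differ.

```latex
% Vanilla GAN: D is trained to tell real x from generated G(z); G is trained to fool D.
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \big[ \log D(x) \big]
  + \mathbb{E}_{z \sim p_z(z)} \big[ \log \big( 1 - D(G(z)) \big) \big]

% Conditional GAN: both networks additionally receive the label y.
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{(x, y) \sim p_{\mathrm{data}}(x, y)} \big[ \log D(x \mid y) \big]
  + \mathbb{E}_{z \sim p_z(z),\, y \sim p(y)} \big[ \log \big( 1 - D(G(z \mid y) \mid y) \big) \big]
```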