Abstract: Motivated by the modulated and non-stationary nature of speech signals, this paper proposes a new feature extraction scheme for speech emotion recognition (SER) based on cyclostationary spectral analysis (CSA). The analysis reveals the underlying first-order and second-order (hidden) periodicities in emotional speech signals through the spectral correlation function (SCF), which is estimated with the FFT accumulation method (FAM). Experiments on the Berlin database of emotional speech (EmoDB) show that the proposed scheme, using cyclostationary spectral features (CSFs), significantly outperforms state-of-the-art methods in terms of recognition accuracy.
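To make the notion of a hidden periodicity concrete, the following minimal sketch estimates the cyclic autocorrelation of a signal at one cycle frequency. It is not the FAM estimator used in the paper, only a simpler time-domain stand-in for the same idea. The function name `cyclic_autocorrelation`, the 8 kHz sampling rate, and the 500 Hz test tone are assumptions chosen for illustration.

```python
import numpy as np

def cyclic_autocorrelation(x, alpha, fs, lag=0):
    """Estimate the cyclic autocorrelation of x at cycle frequency alpha (Hz).

    Hypothetical helper for illustration: it averages the lag product
    x[n + lag] * conj(x[n]) after demodulating it by exp(-j*2*pi*alpha*n/fs).
    """
    x = np.asarray(x, dtype=complex)
    n = np.arange(len(x) - lag)
    lag_product = x[lag:] * np.conj(x[:len(x) - lag])
    return np.mean(lag_product * np.exp(-2j * np.pi * alpha * n / fs))

# Assumed test signal: one second of a 500 Hz tone sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
tone = np.cos(2 * np.pi * 500 * t)

# The squared tone contains a component at 1000 Hz, so the estimate is
# large at alpha = 1000 Hz and close to zero at an unrelated cycle frequency.
peak = abs(cyclic_autocorrelation(tone, 1000, fs))
off = abs(cyclic_autocorrelation(tone, 700, fs))
```

A nonzero value at some nonzero cycle frequency is the signature of cyclostationarity. The SCF is the Fourier transform of this quantity over the lag variable, and FAM is one efficient way to estimate it across many cycle frequencies at once.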