Abstract: Highlights
• An emotion embedding framework for robust emotional speaker recognition is proposed.
• Emotional feature extractors are pre-trained to obtain a prior emotional representation.
• Various emotion embeddings are extracted by decomposing the emotional features.
• Self-attention is introduced to measure the importance of the emotion embeddings.
• The emotional speaker embedding enriches neutral speech with emotional information.