Abstract: We investigate unsupervised adaptation algorithms for the semantic-affective models proposed in [1, 2]. Affective ratings at the utterance level are generated based on an emotional lexicon, which in turn is created using a semantic (similarity) model estimated over raw, unlabeled text. Motivated by methods used in language modeling and grammar induction, we propose the use of pragmatic constraints and perplexity as criteria to filter the unlabeled data used to generate the semantic similarity model. The proposed adaptation method creates task-dependent semantic similarity models and task-dependent word/term affective ratings. The proposed adaptation algorithms are tested on anger/distress detection in transcribed speech data and on sentiment analysis in tweets, showing significant relative classification error reductions of up to 10%.