Abstract: There have been several attempts to create an accurate and thorough emotion lexicon in English, which identifies the emotional content of words. Of the several commonly used resources, the NRC emotion lexicon (Mohammad and Turney, 2013b) has received the most attention due to its availability, size, and choice of Plutchik’s expressive 8-class emotion model. In this paper we identify a large number of troubling entries in the NRC lexicon, where words that should in most contexts be emotionally neutral, with no affect (e.g., lesbian, stone, mountain), are associated with emotional labels that are inaccurate, nonsensical, pejorative, or, at best, highly contingent and context-dependent (e.g., lesbian labeled as DISGUST and SADNESS, stone as ANGER, or mountain as ANTICIPATION). We describe a procedure for semi-automatically correcting these problems in the NRC lexicon, which includes disambiguating POS categories and aligning NRC entries with other emotion lexicons to infer the accuracy of labels. We demonstrate via an experimental benchmark that these corrections improve the quality of the resource. We release the revised resource and our code to enable other researchers to reproduce and build upon our results.
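
The alignment step mentioned above can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `flag_unsupported_labels`, the agreement rule (a label is flagged when a second lexicon covers the word but does not assign that label), and all entries other than the stone and mountain labels quoted above are assumptions made for illustration. Flagged labels are only candidates for manual review, in keeping with a semi-automatic workflow.

```python
from typing import Dict, Set

# A lexicon maps each word to the set of emotion labels assigned to it.
Lexicon = Dict[str, Set[str]]


def flag_unsupported_labels(target: Lexicon, reference: Lexicon) -> Lexicon:
    """Return, per word, the labels in `target` that `reference` does not confirm.

    Words absent from `reference` are skipped, because the reference
    offers no evidence for or against their labels.
    """
    flagged: Lexicon = {}
    for word, labels in target.items():
        if word not in reference:
            continue
        unsupported = labels - reference[word]  # set difference
        if unsupported:
            flagged[word] = unsupported
    return flagged


# Toy data. "stone" and "mountain" carry the labels quoted in the abstract;
# "furious" and "zyzzyva" and the whole reference lexicon are made up.
nrc_like: Lexicon = {
    "stone": {"anger"},
    "mountain": {"anticipation"},
    "furious": {"anger"},
    "zyzzyva": {"joy"},
}
reference: Lexicon = {
    "stone": set(),         # covered by the reference, no emotion assigned
    "mountain": set(),
    "furious": {"anger"},   # the reference agrees with the target
}

flagged = flag_unsupported_labels(nrc_like, reference)
print(flagged)  # {'stone': {'anger'}, 'mountain': {'anticipation'}}
```

Skipping words that the reference does not cover is a deliberately conservative choice: absence from a second lexicon is treated as missing evidence, not as disagreement.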