Phonetic-Based Microtext Normalization for Twitter Sentiment Analysis

ICDM Workshops 2017 (modified: 04 Nov 2022)
Abstract: The proliferation of Web 2.0 technologies and the increasing use of computer-mediated communication have given rise to a new form of written text, termed microtext. Microtext poses new challenges to natural language processing tools, which are usually designed for well-written text. This paper proposes a phonetic-based framework for normalizing microtext to plain English and, hence, improving the classification accuracy of sentiment analysis. Results show that tweets normalized by our model and tweets normalized by human annotators have a high similarity index (>0.8) in 85.31% of cases, and that polarity detection accuracy increases by more than 4% after normalization.
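
To make the idea of phonetic normalization concrete, the following is a minimal sketch and not the paper's framework. The abstract does not say which phonetic algorithm is used, so classic American Soundex stands in here as an assumption. The helper names (`soundex`, `build_index`, `normalize`) and the tiny lexicon are hypothetical and chosen only for illustration. Out-of-vocabulary tokens are replaced by a lexicon word that shares their phonetic code, and tokens with no phonetic match are left unchanged.

```python
def soundex(word: str) -> str:
    """Classic American Soundex code (assumed stand-in phonetic encoder)."""
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in letters:
            codes[ch] = digit

    # Keep letters only; digits and punctuation carry no Soundex code.
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""

    encoded = []
    prev = codes.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        digit = codes.get(ch, "")  # vowels map to "" and reset prev
        if digit and digit != prev:
            encoded.append(digit)
        prev = digit
    return (word[0].upper() + "".join(encoded) + "000")[:4]


def build_index(lexicon):
    """Map each phonetic code to the first lexicon word that produces it."""
    index = {}
    for word in lexicon:
        index.setdefault(soundex(word), word)
    return index


def normalize(text: str, lexicon) -> str:
    """Replace out-of-vocabulary tokens with a phonetically matching word."""
    vocabulary = set(lexicon)
    index = build_index(lexicon)
    out = []
    for token in text.lower().split():
        if token in vocabulary:
            out.append(token)
        else:
            # Fall back to the original token when no code matches.
            out.append(index.get(soundex(token), token))
    return " ".join(out)


# Hypothetical toy lexicon; a real system would use a full dictionary.
LEXICON = ["i", "love", "you", "please", "come", "tomorrow"]
print(normalize("i luv u pls come tomoro", LEXICON))
```

In this sketch "luv", "pls", and "tomoro" map to "love", "please", and "tomorrow", while "u" stays as it is because Soundex keeps the first letter and so cannot relate it to "you". Gaps like this are one reason a practical normalizer would need a richer phonetic representation than Soundex.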