A Hybrid Rule/Model-Based Finite-State Framework for Normalizing SMS Messages

2010 (modified: 16 Jul 2019), ACL 2010, Readers: Everyone
Abstract: In recent years, research in natural language processing has increasingly focused on normalizing SMS messages. Several well-defined approaches have been proposed, but the problem remains far from solved: the best systems achieve an 11% Word Error Rate. This paper presents a method that shares similarities with both spell checking and machine translation approaches. The normalization part of the system is entirely based on models trained from a corpus. Evaluated on French by 10-fold cross-validation, the system achieves a 9.3% Word Error Rate and a 0.83 BLEU score.