Correcting Keyboard Layout Errors and Homoglyphs in Queries

Derek Barnes, Mahesh Joshi, Hassan Sawaf

2014 (modified: 16 Jul 2019) | EMNLP 2014 | Readers: Everyone

Abstract: Keyboard layout errors and homoglyphs in cross-language queries impact our ability to correctly interpret user information needs and offer relevant results. We present a machine learning approach to correcting these errors, based largely on character-level n-gram features. We demonstrate superior performance over rule-based methods, as well as a significant reduction in the number of queries that yield null search results.
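To make the terms in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Russian/English language pair (the abstract does not name one) and uses hypothetical helper names: `char_ngrams` counts character-level n-grams of the kind a classifier could take as features, `remap_layout` undoes a query typed on a QWERTY layout when the ЙЦУКЕН layout was intended, and `normalize_homoglyphs` replaces a few Cyrillic letters with their Latin look-alikes.

```python
from collections import Counter

def char_ngrams(text, n_min=1, n_max=3):
    """Count character n-grams of length n_min..n_max.

    The '^' and '$' boundary markers are an illustrative choice,
    not something the abstract specifies.
    """
    padded = f"^{text}$"
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts

# Assumed layout pair: QWERTY keys -> the ЙЦУКЕН letters on the same keys.
# Lowercase letter keys only; punctuation keys are omitted for brevity.
QWERTY_TO_RU = str.maketrans(
    "qwertyuiopasdfghjklzxcvbnm",
    "йцукенгшщзфывапролдячсмить",
)

def remap_layout(query):
    """Reinterpret a QWERTY-typed query as if typed on ЙЦУКЕН."""
    return query.translate(QWERTY_TO_RU)

# A small, deliberately incomplete homoglyph table:
# Cyrillic letters that look like Latin ones.
CYRILLIC_TO_LATIN = str.maketrans({
    "\u0430": "a",  # Cyrillic а
    "\u0435": "e",  # Cyrillic е
    "\u043e": "o",  # Cyrillic о
    "\u0440": "p",  # Cyrillic р
    "\u0441": "c",  # Cyrillic с
    "\u0445": "x",  # Cyrillic х
})

def normalize_homoglyphs(query):
    """Replace the listed Cyrillic look-alikes with Latin letters."""
    return query.translate(CYRILLIC_TO_LATIN)

if __name__ == "__main__":
    print(remap_layout("ghbdtn"))
    print(char_ngrams(remap_layout("ghbdtn"), 2, 2).most_common(3))
    print(normalize_homoglyphs("\u0441\u043e\u0441\u043e"))
```

A fixed lookup like this corresponds to the rule-based baseline the abstract contrasts against. In a learned approach, n-gram counts such as those from `char_ngrams` would instead feed a classifier that decides whether a query needs remapping at all.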
