Keywords: input methods, language bias, large language model, back-transliteration
Abstract: Back-transliteration serves a dual purpose: processing informal Romanized text and powering phonetic Input Method Editors (IMEs). This survey bridges the fragmented research landscape by introducing a "maturity spectrum" that contrasts the sophisticated engines of CJK languages with the foundational models of Arabic and Indic scripts. We systematically analyze the convergent technological evolution, from statistical models to Large Language Models (LLMs). We demonstrate that despite the rise of LLMs, specialized back-transliteration remains essential for latency-sensitive input. The paper concludes with a technical roadmap for building the next generation of universal input systems.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: Ethics, Bias, and Fairness
Contribution Types: Surveys
Languages Studied: English
Submission Number: 10029