Keywords: Pinyin Input Method, IME, LLM, RWKV, Ladder Side-Tuning
TL;DR: This paper introduces AttnInput, a novel Pinyin IME that leverages the RWKV language model to achieve state-of-the-art performance on abbreviated Pinyin input.
Abstract: The Pinyin Input Method Engine (IME) is widely used for inputting Chinese characters, but effectively integrating it with powerful large language models (LLMs) remains challenging due to issues such as semantic discontinuity and inefficient training. This paper presents AttnInput, a novel approach that leverages the strengths of the RWKV language model, specifically its linear computational complexity and "infinite" context length, to enhance the Pinyin IME. Our method integrates Pinyin information directly into the internal state of RWKV through a lightweight side network, effectively addressing the semantic discontinuity that affected previous LLM-based IMEs. Furthermore, AttnInput employs a pre-training strategy that significantly reduces both the required training data and the computational cost compared to previous methods. Experimental results demonstrate that AttnInput achieves state-of-the-art performance on abbreviated Pinyin input, especially as the Pinyin sequence length increases. This efficient design also allows us to scale up to larger models and incorporate longer contexts, further improving accuracy and user experience.
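To make the abstract's architectural claim concrete (a frozen backbone plus a lightweight ladder-style side network that fuses Pinyin information into the model's hidden states), here is a minimal PyTorch sketch. It is an illustration under assumptions, not the authors' implementation: the backbone call with `return_all_hiddens`, the dimensions, the `PinyinSideBlock`/`AttnInputSideNet` names, and the exact fusion rule are all hypothetical stand-ins for whatever the paper actually uses.

```python
import torch
import torch.nn as nn

class PinyinSideBlock(nn.Module):
    """One ladder rung: the side state, conditioned on the frozen backbone's
    hidden state, cross-attends over the Pinyin embeddings (hypothetical design)."""
    def __init__(self, d_model, d_side, n_heads=4):
        super().__init__()
        self.down = nn.Linear(d_model, d_side)          # project backbone state down
        self.attn = nn.MultiheadAttention(d_side, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_side)

    def forward(self, h_backbone, h_side, pinyin_emb):
        q = self.norm(h_side + self.down(h_backbone))   # ladder connection from the backbone
        out, _ = self.attn(q, pinyin_emb, pinyin_emb)   # attend over Pinyin tokens
        return h_side + out                             # residual update of the side state

class AttnInputSideNet(nn.Module):
    """Lightweight side network trained alongside a frozen RWKV-style backbone.
    Only the side parameters receive gradients, which is what keeps
    ladder-side-tuning cheap relative to full fine-tuning."""
    def __init__(self, backbone, n_layers, d_model, d_side=128, pinyin_vocab=512):
        super().__init__()
        self.backbone = backbone.eval()                 # frozen LLM
        for p in self.backbone.parameters():
            p.requires_grad_(False)
        self.pinyin_emb = nn.Embedding(pinyin_vocab, d_side)
        self.blocks = nn.ModuleList(
            [PinyinSideBlock(d_model, d_side) for _ in range(n_layers)])
        self.up = nn.Linear(d_side, d_model)            # merge back before the LM head

    def forward(self, token_ids, pinyin_ids):
        pin = self.pinyin_emb(pinyin_ids)               # (batch, pinyin_len, d_side)
        with torch.no_grad():
            # Assumed backbone API returning one hidden state per layer,
            # each of shape (batch, seq_len, d_model).
            hiddens = self.backbone(token_ids, return_all_hiddens=True)
        h_side = torch.zeros(*hiddens[0].shape[:2], pin.size(-1), device=pin.device)
        for blk, h in zip(self.blocks, hiddens):
            h_side = blk(h, h_side, pin)
        return hiddens[-1] + self.up(h_side)            # fused representation for the head
```

In this sketch only `pinyin_emb`, the side blocks, and `up` are trainable, so the trainable parameter count stays small; any resemblance to the paper's actual fusion into RWKV's internal state is by construction of the sketch, not confirmed by the abstract.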
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6969