Context-Aware Input Switching in Mobile Devices: A Multi-Language, Emoji-Integrated Typing System

18 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Context-Aware, Input Switching, Multilingual
Abstract: Multilingual and emoji-integrated typing has become increasingly prevalent in mobile communication, especially among users who switch frequently between languages and expressive symbols in real-time conversations. This is particularly pronounced in linguistically diverse regions, where code-switching, the practice of alternating between languages within a discourse, is integral to natural communication. However, existing mobile input systems largely rely on static language models that fail to capture the dynamic, context-dependent nature of multilingual typing, leading to frequent manual switching, prediction errors, and substantial latency overhead. To address these limitations, we introduce CAISS (Context-Aware Input Switching System), a neural architecture for multilingual mobile input based on predictive language switching. Unlike traditional reactive systems, CAISS proactively anticipates language transitions by modeling the interplay between linguistic patterns, temporal dynamics, application contexts, and social cues with a multi-scale attention mechanism designed for edge deployment. We construct a multilingual, emoji-integrated typing dataset and evaluate CAISS against commercial baselines across six languages commonly used in code-switching contexts: English, Mandarin Chinese, Cantonese, Malay, Tamil, and Vietnamese. CAISS improves switching accuracy by 23.8\% and reduces typing latency by 34.1\% relative to existing approaches. With only 2.5M parameters, the system supports real-time inference with sub-10ms latency on contemporary mobile processors while maintaining competitive performance across diverse linguistic and contextual conditions.
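The abstract describes a lightweight attention model that fuses typed-token history with side-channel context (temporal, application, social signals) to predict the next input language. The sketch below is not the authors' CAISS implementation; it is a minimal, hypothetical PyTorch illustration of that general design, with all module names, dimensions, and the context-feature encoding assumed for the example.

```python
# Minimal sketch (assumptions, not the paper's code): a small transformer encoder
# attends over recent tokens plus a projected context vector (time of day, app
# category, conversation partner features, etc.) and emits logits over which
# language the user will type next.
import torch
import torch.nn as nn

LANGUAGES = ["en", "zh", "yue", "ms", "ta", "vi"]  # six languages named in the abstract


class ContextAwareSwitchPredictor(nn.Module):
    def __init__(self, vocab_size=8000, d_model=128, n_heads=4, n_layers=2,
                 n_context_features=16, n_languages=len(LANGUAGES)):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        # Project side-channel context features into the model dimension.
        self.context_proj = nn.Linear(n_context_features, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=256, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_languages)

    def forward(self, token_ids, context_features):
        # token_ids: (batch, seq_len); context_features: (batch, n_context_features)
        tok = self.token_emb(token_ids)
        ctx = self.context_proj(context_features).unsqueeze(1)  # prepend as a context "token"
        h = self.encoder(torch.cat([ctx, tok], dim=1))
        return self.head(h[:, 0])  # read the prediction off the context position


model = ContextAwareSwitchPredictor()
print(sum(p.numel() for p in model.parameters()))  # parameter count stays in the low millions
logits = model(torch.randint(0, 8000, (2, 32)), torch.randn(2, 16))
print(logits.shape)  # (2, 6): per-language switching logits
```

At these assumed sizes the model stays within a few million parameters, which is the regime the abstract targets for on-device, sub-10ms inference; the actual CAISS architecture and feature set may differ.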
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 10782