Abstract: Interest in developing small language models (SLMs) for on-device deployment is growing fast.
However, existing SLM designs rarely take the characteristics of the device hardware into account.
In contrast, this work presents a simple yet effective principle for SLM design: search for an architecture with optimal runtime efficiency before pre-training.
Guided by this principle, we develop the PhoneLM SLM family (with 0.5B and 1.5B versions), which achieves a state-of-the-art capability-efficiency tradeoff among models of similar parameter size.
We fully open-source the code, weights, and training datasets of PhoneLM, including both base and instructed versions, for reproducibility and transparency.
We also release a finetuned version of PhoneLM capable of accurate Android Intent invocation, and an end-to-end Android demo.
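To make the stated principle concrete, the sketch below illustrates one way an architecture search for runtime efficiency before pre-training could look: enumerate candidate decoder configurations, keep those within a parameter budget, and rank them by measured forward latency. This is a minimal illustration, not the authors' actual pipeline; the candidate grid, parameter budget, and latency proxy are all assumptions.

```python
# Illustrative sketch (assumed, not PhoneLM's pipeline): rank candidate SLM
# architectures by measured latency *before* committing to pre-training.
import itertools
import time

import torch
import torch.nn as nn


def build_candidate(d_model, n_layers, n_heads, vocab=32000):
    # Tiny decoder-style stand-in for a candidate architecture.
    layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
    return nn.Sequential(
        nn.Embedding(vocab, d_model),
        nn.TransformerEncoder(layer, n_layers),
        nn.Linear(d_model, vocab),
    )


def measure_latency(model, seq_len=128, iters=5):
    # Crude latency proxy: average forward time on a fixed-length prompt.
    x = torch.randint(0, 32000, (1, seq_len))
    with torch.no_grad():
        model(x)  # warm-up
        t0 = time.perf_counter()
        for _ in range(iters):
            model(x)
    return (time.perf_counter() - t0) / iters


budget = 0.6e9  # rough ~0.5B-parameter budget (assumed)
candidates = itertools.product([1024, 1536, 2048], [12, 16, 24], [16])

results = []
for d_model, n_layers, n_heads in candidates:
    model = build_candidate(d_model, n_layers, n_heads)
    params = sum(p.numel() for p in model.parameters())
    if params <= budget:
        results.append(((d_model, n_layers, n_heads), params, measure_latency(model)))

# Choose the fastest architecture that fits the parameter budget,
# then hand that configuration off to pre-training.
best_cfg, best_params, best_latency = min(results, key=lambda r: r[2])
print("chosen (d_model, layers, heads):", best_cfg, "params:", best_params)
```

In practice such a search would profile latency on the target phone hardware rather than on the development machine, but the overall flow (enumerate, filter by size, rank by measured efficiency, then pre-train the winner) stays the same.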
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Language Modeling
Languages Studied: English
Submission Number: 1238