Abstract: We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for up to 3 epochs. Building on the architecture and tokenizer of Llama
2 (Touvron et al., 2023b), TinyLlama leverages various advances contributed by the
open-source community, e.g., FlashAttention (Dao, 2023) and Lit-GPT (LightningAI, 2023), achieving better computational efficiency. Despite its relatively small
size, TinyLlama demonstrates remarkable performance in a series of downstream
tasks. It significantly outperforms existing open-source language models with comparable sizes. Our model checkpoints and code are publicly available on GitHub at
https://github.com/jzhang38/TinyLlama.