Abstract: We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for up to 3 epochs. Building on the architecture and tokenizer of Llama
2 (Touvron et al., 2023b), TinyLlama leverages various advances contributed by the
open-source community, e.g., FlashAttention (Dao, 2023) and Lit-GPT (LightningAI, 2023), achieving better computational efficiency. Despite its relatively small
size, TinyLlama demonstrates remarkable performance in a series of downstream
tasks. It significantly outperforms existing open-source language models with comparable sizes. Our model checkpoints and code are publicly available on GitHub at
https://github.com/jzhang38/TinyLlama.