Abstract: We introduce two multilingual, multimodal foundation language models that power
Apple Intelligence features across Apple devices and services: (i) a ∼3B-parameter
on-device model optimized for Apple silicon through architectural innovations such
as KV-cache sharing and 2-bit quantization-aware training; and (ii) a scalable server
model built on a novel Parallel-Track Mixture-of-Experts (PT-MoE) transformer that
combines track parallelism, mixture-of-experts sparse computation, and interleaved
global–local attention to deliver high quality with competitive cost on Apple’s Private
Cloud Compute platform. Both models are trained on large-scale multilingual and
multimodal datasets sourced via responsible web crawling, licensed corpora, and
high-quality synthetic data, then further refined with supervised fine-tuning and
reinforcement learning on a new asynchronous platform. The resulting models support
several additional languages while understanding images and executing tool calls. In
public benchmarks and human evaluations, both the server model and the on-device
model match or surpass comparably sized open baselines.
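To make the interleaved global–local attention concrete, the following is a minimal sketch of one possible layer schedule. The 3:1 local-to-global ratio, the window size, and the names `AttentionKind` and `attentionSchedule` are illustrative assumptions, not the paper's reported configuration.

```swift
// Hypothetical sketch of interleaving: most layers use sliding-window
// (local) attention; every fourth layer attends globally over the
// full context. Ratio and window size are assumptions for illustration.
enum AttentionKind {
    case global
    case local(window: Int)
}

func attentionSchedule(numLayers: Int, localWindow: Int = 4096) -> [AttentionKind] {
    (0..<numLayers).map { i in
        // Every fourth layer is global; the rest use a local window.
        (i + 1) % 4 == 0 ? .global : .local(window: localWindow)
    }
}

let schedule = attentionSchedule(numLayers: 32)
```

Local layers keep KV-cache size and attention cost bounded by the window, while the periodic global layers preserve long-range information flow across the full context.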
A new Swift-centric Foundation Models framework exposes guided generation,
constrained tool calling, and LoRA adapter fine-tuning, allowing developers to inte-
grate these capabilities with a few lines of code. The latest advancements in Apple
Intelligence models are grounded in our Responsible AI approach with safeguards
like content filtering and locale-specific evaluation, as well as our commitment to
protecting our users’ privacy with innovations like Private Cloud Compute.
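As a sketch of the developer experience the abstract describes, the snippet below shows what "a few lines of code" against the Foundation Models framework can look like. `LanguageModelSession`, `respond(to:)`, `@Generable`, and `@Guide` follow Apple's publicly documented API; the prompts and the generated `Itinerary` type are illustrative.

```swift
import FoundationModels

// Illustrative type for guided generation: the framework constrains
// the model's output to match this structure.
@Generable
struct Itinerary {
    @Guide(description: "A short, catchy title for the day")
    var title: String
    var activities: [String]
}

// A session backed by the on-device foundation model.
let session = LanguageModelSession(
    instructions: "You are a concise travel-planning assistant."
)

// Plain text response.
let reply = try await session.respond(to: "Suggest one afternoon in Kyoto.")
print(reply.content)

// Guided generation: the response content is a typed Itinerary.
let plan = try await session.respond(
    to: "Plan a one-day Kyoto itinerary.",
    generating: Itinerary.self
)
print(plan.content.title)
```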