TLDR: Apple has introduced two new multilingual, multimodal foundation language models that power Apple Intelligence features: a compact on-device model optimized for Apple silicon and a scalable server model built on a novel Parallel-Track Mixture-of-Experts (PT-MoE) transformer. These models, trained on diverse and responsibly sourced data, support multiple languages, understand images, and execute tool calls. They are designed with architectural innovations for efficiency and quality, including KV-cache sharing and advanced quantization techniques. A new Swift-centric Foundation Models framework allows developers to integrate these capabilities, while Apple’s Responsible AI principles ensure user privacy and safety.
Apple has unveiled the foundational language models powering its new Apple Intelligence features, marking a significant step in integrating generative AI across its devices and services. This initiative, introduced at the 2025 Worldwide Developers Conference, aims to enhance user experience while prioritizing privacy. The core of this advancement lies in two distinct yet complementary models: a compact on-device model and a powerful server-based model.
Two Models for Diverse Needs
The first model is a roughly 3-billion-parameter on-device model, meticulously optimized for Apple silicon. Its design incorporates architectural innovations like KV-cache sharing, which significantly reduces memory usage and shortens time-to-first-token, the delay before the first token of a response appears. It also uses 2-bit quantization-aware training, a technique that compresses the model while preserving quality, making it highly efficient for local processing.
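The idea behind KV-cache sharing can be pictured with a toy sketch: some layers act as cache "owners," and the remaining layers read from an owner's cache instead of allocating their own, halving cache memory in this hypothetical layout. The layer mapping below is purely illustrative, not Apple's actual configuration.

```swift
import Foundation

// Toy illustration of cross-layer KV-cache sharing (hypothetical layout,
// not Apple's actual implementation). A layer either owns a key/value
// cache or reuses the cache of an earlier "anchor" layer, so only the
// anchor layers allocate storage.
struct KVCache {
    var keys: [[Double]] = []   // one entry per generated token
    var values: [[Double]] = []
}

let numLayers = 8
// Hypothetical mapping: each odd layer shares the cache of the even layer below it.
let cacheOwner: [Int] = (0..<numLayers).map { $0 - ($0 % 2) }

// Allocate caches only for layers that own one.
var caches: [Int: KVCache] = [:]
for layer in 0..<numLayers where cacheOwner[layer] == layer {
    caches[layer] = KVCache()
}

// Any layer resolves to its owner's cache when attending over past tokens.
func cache(for layer: Int) -> KVCache { caches[cacheOwner[layer]]! }

print("Layers: \(numLayers), caches allocated: \(caches.count)")  // half the cache memory
```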
The second is a scalable server model, built upon a novel Parallel-Track Mixture-of-Experts (PT-MoE) transformer. The architecture combines track parallelism, sparse computation, and interleaved global-local attention, allowing the server model to deliver high-quality results at competitive cost on Apple's Private Cloud Compute platform and ensuring robust performance for more complex tasks.
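The sparse-computation side of a Mixture-of-Experts layer can be sketched with a generic top-k router; this shows standard MoE routing, not the PT-MoE track design itself. Each token activates only its k highest-scoring experts, so per-token compute stays flat as the expert count grows.

```swift
import Foundation

// Minimal sketch of sparse MoE routing: pick the top-k experts per token
// and blend their outputs with softmax weights over the selected logits.
func topKExperts(routerLogits: [Double], k: Int) -> [(expert: Int, weight: Double)] {
    let top = routerLogits.enumerated()
        .sorted { $0.element > $1.element }
        .prefix(k)
    // Softmax over only the selected experts' logits.
    let maxLogit = top.map(\.element).max() ?? 0
    let exps = top.map { exp($0.element - maxLogit) }
    let sum = exps.reduce(0, +)
    return zip(top, exps).map { ($0.offset, $1 / sum) }
}

let logits = [0.2, 1.5, -0.3, 0.9]       // router scores for 4 experts
print(topKExperts(routerLogits: logits, k: 2))
// Only experts 1 and 3 run for this token; their outputs are mixed by weight.
```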
Understanding the World Through Data
Both models are trained on vast, diverse datasets drawn from responsible web crawling, licensed corpora, and high-quality synthetic data. Apple emphasizes that no private user data or interactions are used in training these foundation models, reinforcing its commitment to privacy. The web crawling strategy, powered by Applebot, focuses on high-quality, diverse content across numerous languages and locales, with careful attention to ethical practices like respecting robots.txt protocols.
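As a rough illustration of what respecting robots.txt means in practice, here is a heavily simplified disallow-rule check. Applebot's real parser is not public, and production crawlers also honor Allow rules, wildcards, and longest-match precedence; this sketch covers only the basic case.

```swift
import Foundation

// Simplified robots.txt courtesy check (illustrative, not Applebot's logic).
func isPathAllowed(robotsTxt: String, userAgent: String, path: String) -> Bool {
    var applies = false
    for line in robotsTxt.split(separator: "\n") {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        if trimmed.lowercased().hasPrefix("user-agent:") {
            let agent = trimmed.dropFirst("user-agent:".count)
                .trimmingCharacters(in: .whitespaces)
            applies = (agent == "*" || agent == userAgent)
        } else if applies, trimmed.lowercased().hasPrefix("disallow:") {
            let rule = trimmed.dropFirst("disallow:".count)
                .trimmingCharacters(in: .whitespaces)
            if !rule.isEmpty, path.hasPrefix(rule) { return false }
        }
    }
    return true
}

let robots = """
User-agent: Applebot
Disallow: /private/
"""
print(isPathAllowed(robotsTxt: robots, userAgent: "Applebot", path: "/private/page"))  // false
```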
To enable visual understanding, the models also incorporate extensive image data. This includes billions of image-text pairs sourced from web crawls, along with over 5 billion synthetically generated image-caption pairs that provide richer, more detailed descriptions. Specialized text-rich image data, such as PDFs, infographics, and charts, is also used to help the models read text embedded within images, which is crucial for features like adding events from a flyer to a calendar.
Training and Optimization for Peak Performance
The training process for these models is multi-staged and highly refined. The text tokenizer was expanded to support more languages, increasing its vocabulary size. The vision encoder undergoes a two-stage training process, first with contrastive pre-training and then joint training with an LLM decoder to align image features with the language model’s representation space.
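The contrastive pre-training stage can be sketched with a toy CLIP-style objective: matched image-text embedding pairs are pulled together and mismatched pairs pushed apart via cross-entropy over similarity logits (one direction shown here; CLIP-style training sums both directions). The embeddings and temperature below are illustrative; Apple's actual encoder and loss details are in the tech report.

```swift
import Foundation

// Toy CLIP-style contrastive objective over matched image/text pairs.
func dot(_ a: [Double], _ b: [Double]) -> Double { zip(a, b).map(*).reduce(0, +) }
func norm(_ a: [Double]) -> [Double] {
    let m = sqrt(dot(a, a)); return a.map { $0 / m }
}

// Row i of each array is a matched image/text pair that should align.
let imageEmb = [[1.0, 0.1], [0.1, 1.0]].map(norm)
let textEmb  = [[0.9, 0.2], [0.0, 1.1]].map(norm)
let temperature = 0.07

// Image-to-text cross-entropy over cosine-similarity logits.
var loss = 0.0
for i in imageEmb.indices {
    let logits = textEmb.map { dot(imageEmb[i], $0) / temperature }
    let maxL = logits.max()!
    let logSumExp = maxL + log(logits.map { exp($0 - maxL) }.reduce(0, +))
    loss += logSumExp - logits[i]   // -log p(correct caption | image i)
}
print("contrastive loss:", loss / Double(imageEmb.count))
```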
Post-training involves Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). SFT combines human-written demonstrations and synthetic data, focusing on areas like general knowledge, reasoning, text-rich image understanding, multilingual Optical Character Recognition (OCR), and visual grounding. RLHF, using a distributed asynchronous infrastructure, further refines the models based on diverse reward signals, leading to significant improvements in human preference evaluations and reasoning capabilities.
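One way to picture "diverse reward signals" is a scalar reward assembled from several judges. The sketch below is purely hypothetical; the signal names, weights, and gating rule are made up for illustration and are not taken from Apple's report.

```swift
import Foundation

// Hypothetical blend of several reward signals into one scalar for RL updates.
struct RewardSignals {
    var helpfulness: Double          // e.g., from a preference reward model
    var instructionFollowing: Double // e.g., from a rule-based checker
    var safety: Double               // e.g., from a safety classifier
}

func combinedReward(_ r: RewardSignals) -> Double {
    // Safety acts as a gate: unsafe responses are penalized regardless
    // of how helpful they are. Weights here are arbitrary.
    let base = 0.6 * r.helpfulness + 0.4 * r.instructionFollowing
    return r.safety < 0.5 ? base - 1.0 : base
}

let sample = RewardSignals(helpfulness: 0.8, instructionFollowing: 0.9, safety: 0.95)
print(combinedReward(sample))  // 0.84
```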
To ensure efficiency without compromising quality, Apple has implemented advanced optimization techniques. The on-device model uses Quantization-Aware Training (QAT) to compress its weights to 2 bits, while the server model employs Adaptive Scalable Texture Compression (ASTC) for 3.56 bits-per-weight compression. To recover any quality loss from this compression, Low-Rank Adaptation (LoRA) adapters are applied and fine-tuned, allowing the models to maintain high performance.
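What mapping weights to 2 bits looks like can be sketched with simple symmetric round-to-nearest using a per-group scale. This is a minimal sketch of the quantization step only; Apple's QAT scheme additionally learns through the quantizer during training and is more involved.

```swift
import Foundation

// 2-bit quantization sketch: each weight becomes an integer code in
// [-2, 1] (4 levels) plus a shared per-group scale.
func quantize2Bit(_ weights: [Double]) -> (codes: [Int], scale: Double) {
    let maxAbs = weights.map(abs).max() ?? 1
    let scale = maxAbs / 2.0
    let codes = weights.map { w in
        min(1, max(-2, Int((w / scale).rounded())))
    }
    return (codes, scale)
}

func dequantize(_ codes: [Int], scale: Double) -> [Double] {
    codes.map { Double($0) * scale }
}

let w = [0.31, -0.12, 0.05, -0.44]
let (codes, scale) = quantize2Bit(w)
print(codes, dequantize(codes, scale: scale))
// The gap between w and its dequantized version is the error that QAT
// (during training) and LoRA adapters (after compression) help absorb.
```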
Empowering Developers with a New Framework
A new Swift-centric Foundation Models framework provides developers with direct access to the on-device language foundation model. This framework simplifies the integration of generative AI capabilities through features like guided generation, which allows developers to directly generate rich Swift data structures, and constrained tool calling, which ensures the structural correctness of tool invocations. The framework also offers a stateful session type, LanguageModelSession, designed to optimize performance and context management.
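Based on Apple's published API for the framework, a minimal usage sketch looks like the following; availability and exact signatures depend on the OS release, and the TripSuggestion type is a made-up example.

```swift
import FoundationModels

// Guided generation: the framework fills in a typed Swift value directly.
// TripSuggestion is an example type; @Generable and @Guide come from the
// Foundation Models framework.
@Generable
struct TripSuggestion {
    @Guide(description: "A city suited to a weekend visit")
    var city: String
    var activities: [String]
}

// LanguageModelSession is stateful: it carries instructions and prior
// turns across requests. This must run in an async context on a
// supported OS release.
let session = LanguageModelSession(instructions: "You are a travel assistant.")
let response = try await session.respond(
    to: "Suggest a weekend trip from Cupertino",
    generating: TripSuggestion.self
)
print(response.content.city, response.content.activities)
```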
Rigorous Evaluation and Responsible AI
Apple conducted extensive evaluations, testing its models on public benchmarks such as MMLU, MMMLU, and MGSM and running human evaluations across various language and reasoning capabilities. The on-device model performs favorably against comparably sized models, while the server model shows strong performance, though it lags behind much larger proprietary models.
Central to Apple’s approach is its commitment to Responsible AI. This is guided by principles such as empowering users, representing global users, designing with care, and protecting privacy. Safeguards like content filtering, locale-specific evaluation, and a comprehensive safety taxonomy are integrated throughout the development process. User feedback mechanisms are also in place to continuously improve the models and features.
These advancements in Apple’s foundation models are set to unlock a wide range of helpful features across Apple’s software platforms, making powerful AI capabilities accessible to users globally in many languages. For more technical details, you can refer to the full research paper: Apple Intelligence Foundation Language Models Tech Report 2025.