Apriel-1.5-15B-Thinker: Achieving Advanced AI Reasoning with Smart Training, Not Just Scale

TLDR: Apriel-1.5-15B-Thinker is a 15-billion parameter open-weights multimodal AI model developed by ServiceNow’s SLAM Lab. It achieves frontier-level performance comparable to much larger models by focusing on an innovative “mid-training” design, which includes depth upscaling, staged continual pre-training with synthetic data, and high-quality supervised fine-tuning. The model excels in both text and vision reasoning benchmarks, demonstrating that advanced AI capabilities can be made accessible and economical for single-GPU deployments without relying on massive scale or complex reinforcement learning. The project is open-source, providing models and training recipes to foster further research.

The SLAM Lab at ServiceNow has unveiled Apriel-1.5-15B-Thinker, a 15-billion parameter open-weights multimodal reasoning model that achieves its performance through training design rather than relying solely on massive scale. This development addresses a critical challenge in AI adoption: making frontier-level capabilities accessible and economical for organizations with limited computational resources and strict deployment constraints.

The core philosophy behind Apriel-1.5-15B-Thinker is that thoughtful “mid-training” design can bridge substantial capability gaps without requiring immense computational power. Mid-training, in this context, refers to a combination of continual pre-training and supervised fine-tuning stages. The model’s creators emphasize that the approach is data-centric: it uses no reinforcement learning or preference optimization, so the reported results can be attributed to the continual pre-training and fine-tuning stages themselves.

A Three-Stage Training Journey

The development of Apriel-1.5-15B-Thinker involved a progressive three-stage methodology, building upon the existing Pixtral-12B model:

1. Depth Upscaling: Instead of starting from scratch, the team expanded the model’s reasoning capacity by increasing its depth. This involved adding more hidden layers to the decoder and training it on a vast corpus of text data, including high-quality web content, technical literature, and programming code.

2. Staged Continual Pre-training (CPT): This crucial phase was divided into two parts. The first phase focused on developing foundational text and vision understanding, using a diverse dataset that included mathematical and scientific reasoning, coding, and multimodal data like document and chart understanding. The second phase specifically enhanced visual reasoning. This was achieved through targeted synthetic data generation, which helped the model learn spatial structure, compositional understanding, and fine-grained perception. This synthetic data included tasks like image reconstruction, visual matching, object detection, and counting, with difficulty modulated to ensure robust learning.

3. High-Quality Supervised Fine-Tuning (SFT): The final stage involved fine-tuning the model on a meticulously curated dataset of instruction-response pairs. A key aspect here was the inclusion of explicit reasoning traces in each response, allowing the model to learn transparent thought processes across domains such as mathematics, coding, science, and tool use. The data curation process involved rigorous filtering, de-duplication, and verification using LLM-as-Judge and execution-based methods to ensure the highest quality.
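The depth upscaling step in the list above can be pictured with a small sketch. The article only says that hidden layers were added to the decoder; it does not say how the new layers were initialized. The sketch below assumes one common option, copying existing layers at evenly spaced positions, and uses a plain Python stand-in instead of a real framework module. The names DecoderLayer and upscale_depth are hypothetical and do not come from the Apriel release.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class DecoderLayer:
    """Hypothetical stand-in for one transformer decoder block."""
    name: str
    weights: list = field(default_factory=list)


def upscale_depth(layers, target_depth):
    """Return a deeper layer stack by copying some existing layers.

    Assumption (not stated in the article): each new layer starts as a
    copy of an existing layer, and the copies are spread evenly through
    the stack.
    """
    extra = target_depth - len(layers)
    if extra < 0:
        raise ValueError("target_depth must be at least the current depth")
    if extra > len(layers):
        raise ValueError("this sketch copies each layer at most once")

    n = len(layers)
    # Evenly spaced indices of the layers that will be duplicated.
    to_copy = {(i * n) // extra for i in range(extra)}

    deeper = []
    for index, layer in enumerate(layers):
        deeper.append(layer)
        if index in to_copy:
            # Deep copy so the new layer can be trained independently.
            clone = copy.deepcopy(layer)
            clone.name = f"{layer.name}_copy"
            deeper.append(clone)
    return deeper


base = [DecoderLayer(f"layer{i}", [float(i)]) for i in range(4)]
deeper = upscale_depth(base, target_depth=6)
print([layer.name for layer in deeper])
# ['layer0', 'layer0_copy', 'layer1', 'layer2', 'layer2_copy', 'layer3']
```

The sketch shows only the structural change. In the pipeline the article describes, the deepened decoder is then trained further on the text corpus so the added layers become useful.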

Performance That Rivals Larger Models

Apriel-1.5-15B-Thinker performs strongly for its compact size. On the Artificial Analysis Intelligence Index, an independent metric for evaluating general intelligence in large language models, the model achieved a score of 52. This score matches that of DeepSeek-R1-0528, a much larger model that requires significantly more computational resources to run.

Across ten image benchmarks, Apriel-1.5-15B-Thinker’s performance was, on average, within five points of leading proprietary models like Gemini-2.5-Flash and Claude Sonnet-3.7. This is a significant achievement for a model designed to operate within the constraints of a single-GPU deployment.

Specific benchmark highlights include:

88% on AIME’25 (mathematical reasoning)
62% on IFBench (instruction following)
68% on τ2-Bench Telecom (specialized domain tasks)
70.2% on MMMU (general multimodal reasoning)
75.5% on MathVista (mathematical reasoning in visual contexts)
88.2% on CharXiv descriptive tasks (document understanding)

These results underscore the model’s broad reasoning competence across various domains, demonstrating that smaller, efficiently trained models can indeed close the gap with frontier models. The paper highlights that Apriel-1.5-15B-Thinker occupies the “most attractive quadrant” in terms of performance-to-scale, offering a superior cost-to-intelligence trade-off.

Contributions to Open-Source AI

The SLAM Lab at ServiceNow is committed to advancing open-source research. They have released the model checkpoint, all training recipes, and evaluation protocols under the MIT license. This open approach aims to democratize access to frontier-level multimodal reasoning and catalyze further research into efficient training methodologies.

The research, led by core contributors Shruthan Radhakrishna, Aman Tiwari, Aanjaneya Shukla, Masoud Hashemi, Rishabh Maheshwary, Shiva Krishna Reddy Malay, Jash Mehta, Pulkit Pattnaik, Saloni Mittal, Khalil Slimi, Kelechi Ogueji, Akintunde Oladipo, Soham Parikh, and Oluwanifemi Bamgbose, offers a compelling vision for the future of AI development, where smart design can overcome the need for sheer scale. You can read the full research paper here: Apriel-1.5-15B-Thinker: Mid-training is all you need.
