TLDR: The Tiny Recursive Model (TRM), a new approach using a single two-layer neural network with only 7 million parameters, significantly outperforms the larger Hierarchical Reasoning Model (HRM) and many Large Language Models (LLMs) on challenging puzzle tasks like Sudoku, Maze, and ARC-AGI. TRM achieves this by simplifying recursive reasoning, eliminating complex theoretical assumptions, using a single network, and streamlining training, demonstrating that “less is more” for generalization on small datasets.
Large Language Models (LLMs) have made incredible strides, but they often struggle with complex puzzle-solving tasks like Sudoku, Maze navigation, and the ARC-AGI challenges. These problems require deep reasoning, and LLMs, which generate answers step-by-step, can easily make errors that invalidate the entire solution. While techniques like Chain-of-Thought (CoT) and Test-Time Compute (TTC) aim to improve reliability, they can be expensive, require high-quality data, and don’t always guarantee success.
An earlier approach, the Hierarchical Reasoning Model (HRM), offered a promising alternative. HRM used two small neural networks that “recurse”, repeating their operations at different frequencies, inspired by how the brain processes information. Combined with a technique called “deep supervision” (improving the answer over multiple supervised steps), HRM showed impressive results on these hard puzzle tasks, even outperforming LLMs with far fewer parameters. However, HRM was quite complex, relying on intricate biological arguments and mathematical theorems that weren’t always perfectly applicable. Its training also required a second forward pass per step for a feature called Adaptive Computational Time (ACT), making it less efficient.
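To make this concrete, here is a rough, illustrative sketch of an HRM-style recursion in PyTorch. It is not the authors' implementation: the GRU cells stand in for HRM's small transformer blocks, and the module names (`f_low`, `f_high`) and loop counts are assumptions chosen for readability.

```python
import torch
import torch.nn as nn

# Illustrative sketch of an HRM-style recursion (not the original implementation).
# A fast low-level network refines zL several times for every single update of zH
# by a slow high-level network; GRU cells stand in for HRM's small transformer blocks.

class HRMSketch(nn.Module):
    def __init__(self, d_model=128, n_low=6, n_high=3):
        super().__init__()
        self.f_low = nn.GRUCell(d_model, d_model)   # fast, low-level network (assumed form)
        self.f_high = nn.GRUCell(d_model, d_model)  # slow, high-level network (assumed form)
        self.head = nn.Linear(d_model, d_model)     # maps the high-level state to an answer
        self.n_low, self.n_high = n_low, n_high

    def forward(self, x, zL, zH):
        for _ in range(self.n_high):             # slow, high-level loop
            for _ in range(self.n_low):          # fast, low-level loop
                zL = self.f_low(x + zH, zL)      # refine zL given the input and zH
            zH = self.f_high(zL, zH)             # update zH from the refined zL
        return self.head(zH), zL, zH
```

With deep supervision, a block like this is applied for several outer steps: the latent states are detached and carried over between steps, and the answer is supervised at every step. HRM back-propagated through only the final updates of this recursion, justified by a fixed-point argument, which is exactly the machinery TRM removes.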
Introducing Tiny Recursive Models (TRM)
A new research paper, “Less is More: Recursive Reasoning with Tiny Networks,” introduces the Tiny Recursive Model (TRM), a significantly simpler and more effective approach. TRM achieves even higher generalization than HRM, using a single, much smaller network with only two layers. With just 7 million parameters, TRM outperforms many LLMs (like Deepseek R1, o3-mini, and Gemini 2.5 Pro) on ARC-AGI-1 and ARC-AGI-2, using less than 0.01% of their parameters. You can read the full paper here: Less is More: Recursive Reasoning with Tiny Networks.
Simplifying Recursive Reasoning
TRM addresses several complexities of HRM:
- No Fixed-Point Theorem Needed: HRM relied on a mathematical theorem to justify only back-propagating through the last few steps of its recursion. TRM bypasses this by back-propagating through the entire recursion process, which, surprisingly, leads to a massive boost in performance without needing complex theoretical assumptions.
- Clearer Latent Features: HRM used two latent features, zL and zH, with a hierarchical interpretation based on biological arguments. TRM simplifies this, viewing one feature (y) as the current proposed solution and the other (z) as a latent reasoning feature. This intuitive explanation clarifies why two features are optimal without needing complex biological justifications.
- Single, Tiny Network: HRM used two separate networks, doubling its parameter count. TRM demonstrates that a single network is sufficient for both roles, iterating on the latent reasoning feature and updating the proposed solution, which significantly reduces parameters while improving generalization (see the sketch after this list).
- Less is More in Layers: Counter-intuitively, TRM found that using fewer layers (2 instead of 4) in its network, while increasing the number of recursions, led to better generalization. This suggests that for tasks with limited data, smaller networks with deep recursion can prevent overfitting.
- Efficient Training with Simplified ACT: HRM’s Adaptive Computational Time (ACT) mechanism, designed to speed up training, required an extra forward pass. TRM simplifies ACT by removing the “continue loss,” eliminating the need for this expensive second pass without compromising accuracy.
- Enhanced Stability with EMA: For datasets with little training data, TRM keeps an Exponential Moving Average (EMA) of its weights, a technique commonly used to stabilize training in other models, which improves stability and reduces overfitting (a minimal EMA update is sketched below).
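Putting these pieces together, here is a minimal sketch of what TRM's core loop could look like in PyTorch. It is an illustration of the description above, not the paper's code: the MLP blocks, the hidden size, the number of inner recursions, and the halting head are assumptions.

```python
import torch
import torch.nn as nn

# Illustrative sketch of TRM's core idea (not the paper's implementation).
# A single tiny network is reused both to refine the latent reasoning feature z
# and to update the proposed answer y; gradients flow through the whole recursion.

class TinyRecursiveSketch(nn.Module):
    def __init__(self, d_model=128, n_layers=2, n_inner=6):
        super().__init__()
        # One small network (two layers here), shared by every recursion step.
        self.net = nn.Sequential(
            *[nn.Sequential(nn.Linear(d_model, d_model), nn.GELU()) for _ in range(n_layers)]
        )
        self.halt_head = nn.Linear(d_model, 1)  # predicts whether to stop (simplified ACT)
        self.n_inner = n_inner

    def forward(self, x, y, z):
        # Refine the latent reasoning feature z, conditioning on the input x and current answer y.
        for _ in range(self.n_inner):
            z = self.net(x + y + z)
        # Reuse the same network once more to update the proposed answer y from z.
        y = self.net(y + z)
        return y, z, self.halt_head(y)
```

Training would run this block for several supervision steps, comparing y to the target at each step (deep supervision) and back-propagating through the entire recursion rather than only its last updates. Because the halting head replaces the ACT “continue” loss, no second forward pass is needed.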
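The EMA of the weights can be maintained in a few lines; the decay value (0.999) and the placeholder model below are assumptions.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # placeholder; in practice this would be the tiny recursive network

# Keep an exponential moving average of the weights for evaluation.
ema_params = {k: v.detach().clone() for k, v in model.state_dict().items()}

def update_ema(model, ema_params, decay=0.999):
    with torch.no_grad():
        for k, v in model.state_dict().items():
            if v.dtype.is_floating_point:
                ema_params[k].mul_(decay).add_(v.detach(), alpha=1.0 - decay)
            else:
                ema_params[k].copy_(v)

# Call update_ema(model, ema_params) after each optimizer step and evaluate with ema_params.
```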
Impressive Performance Gains
TRM shows significant improvements across various benchmarks:
- On Sudoku-Extreme, TRM-MLP achieved 87.4% test accuracy, a substantial leap from HRM’s 55.0%.
- For Maze-Hard, TRM-Att reached 85.3% accuracy, compared to HRM’s 74.5%.
- On ARC-AGI-1, TRM-Att scored 44.6%, surpassing HRM’s 40.3%.
- And on the more challenging ARC-AGI-2, TRM-Att achieved 7.8% accuracy, higher than HRM’s 5.0%.
These results are particularly noteworthy because TRM achieves them with significantly fewer parameters (7M for TRM-Att vs. 27M for HRM), demonstrating remarkable efficiency.
The Future of Recursive Reasoning
TRM represents a significant step forward in solving complex reasoning tasks with small, efficient models. By simplifying the underlying mechanisms and focusing on effective recursion, it offers a powerful alternative to large, resource-intensive LLMs for specific problem types. While currently a supervised learning method, future work could explore extending TRM to generative tasks, allowing it to produce multiple possible solutions.


