TLDR: Hailo has launched the Hailo-10H, which it describes as the first edge AI chip specifically designed to run generative AI models, including LLMs and VLMs, directly on devices. The chip promises ultra-low latency, high efficiency, enhanced data privacy, and reduced reliance on cloud infrastructure, marking a significant step in making advanced AI accessible at the edge.
Tel Aviv, Israel – August 1, 2025 – Hailo, a developer of edge AI processors, has officially announced the commercial availability of its Hailo-10H AI accelerator. The company describes this second-generation chip as the first discrete AI processor to bring generative AI capabilities directly to edge devices, changing how large language models (LLMs), vision-language models (VLMs), and other multi-modal AI applications are deployed.
The Hailo-10H is engineered to execute these complex generative AI workloads entirely on-device, thereby eliminating the need for constant cloud connectivity. This on-device processing capability offers several critical advantages, including ultra-low latency, enhanced data privacy by keeping sensitive information local, and significant reductions in cloud bandwidth usage and associated costs. The chip operates with remarkable power efficiency, typically consuming just 2.5 watts, making it ideal for a wide array of edge applications where power constraints are a concern.
The Hailo-10H delivers 40 tera-operations per second (TOPS) at INT4 precision and 20 TOPS at INT8. According to Hailo, the chip can generate the first token in under one second and sustain 10 tokens per second on 2-billion-parameter LLMs. It can also generate an image with Stable Diffusion 2.1 in less than five seconds, showing that it handles generative tasks beyond text.
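To put these figures in perspective, the short Python sketch below turns them into a rough response-time and energy estimate. It is a back-of-the-envelope illustration, not a benchmark: the function names are hypothetical, "under one second" is treated as a one-second upper bound, and the sketch assumes that the typical 2.5-watt power figure applies during LLM generation, which the announcement does not state.

```python
# Figures quoted by Hailo; treated here as rough bounds, not measurements.
FIRST_TOKEN_S = 1.0     # "under one second", used as an upper bound
TOKENS_PER_S = 10.0     # sustained rate quoted for 2-billion-parameter LLMs
TYPICAL_POWER_W = 2.5   # typical power draw (assumed to hold during generation)


def response_time_s(n_tokens: int) -> float:
    """Estimate the time to stream n_tokens output tokens."""
    if n_tokens <= 0:
        return 0.0
    # The first token arrives after the first-token latency;
    # the remaining tokens stream at the sustained rate.
    return FIRST_TOKEN_S + (n_tokens - 1) / TOKENS_PER_S


def energy_per_token_j() -> float:
    """Estimate joules per generated token at the typical power draw."""
    return TYPICAL_POWER_W / TOKENS_PER_S


if __name__ == "__main__":
    print(f"100-token reply: about {response_time_s(100):.1f} s")
    print(f"Energy per token: about {energy_per_token_j():.2f} J")
```

Under these assumptions, a 100-token reply takes roughly 11 seconds and each token costs about a quarter of a joule. Real-world numbers will depend on the model, prompt length, and quantization.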
Orr Danon, CEO and co-founder of Hailo, stated, “With the Hailo-10H now available for order, we’re taking another major step toward our mission of making AI accessible to all. This is the first discrete AI processor to bring real generative AI performance to the edge, combining high efficiency, cost-effectiveness, and a robust software ecosystem.” The chip is fully compatible with Hailo’s existing software stack, which the company says is used by more than 10,000 monthly active developers, easing integration and development.
Beyond generative AI, the Hailo-10H is designed to work in hybrid AI pipelines that combine generative models with traditional convolutional neural networks (CNNs) for tasks such as real-time 4K video analytics. The chip is automotive-qualified to AEC-Q100 Grade 2, with automotive production targeted for 2026. Early adopters include HP, which plans to integrate the Hailo-10H into its HP AI Accelerator M.2 Card for point-of-sale (POS) systems.
The launch of the Hailo-10H comes at a pivotal time, as the edge AI accelerator market is projected to experience substantial growth, from an estimated $10.13 billion in 2025 to $113.71 billion by 2034. This growth is largely driven by increasing demand for privacy-first, low-latency AI processing solutions, a demand that the Hailo-10H is uniquely positioned to meet.
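For readers who want the growth rate behind that projection, the sketch below derives the implied compound annual growth rate (CAGR) from the two market-size figures quoted above. The helper name `implied_cagr` is illustrative, and the calculation assumes smooth year-over-year compounding across the nine years from 2025 to 2034, which is a simplification of any real forecast.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the constant annual growth rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Market sizes in billions of US dollars, as quoted in the article.
    rate = implied_cagr(10.13, 113.71, 2034 - 2025)
    print(f"Implied CAGR, 2025-2034: {rate:.1%}")
```

The result works out to roughly 31% per year, which is what an elevenfold increase over nine years requires.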


