TLDR: RaiderChip and iWave have announced a collaboration to accelerate Large Language Models (LLMs) on AMD Versal AI Edge VE2302 System on Modules. The partnership aims to provide efficient, high-performance, and secure AI solutions for edge devices and data centers, enabling generative AI applications to run at interactive speeds — 12 tokens per second for Llama 3.2 on the demonstrated hardware.
In a significant stride towards advancing generative AI capabilities at the edge, RaiderChip and iWave have joined forces to deliver high-performance Large Language Model (LLM) acceleration on the AMD Versal AI Edge VE2302 System on Module (SoM). This collaboration addresses the escalating demand for efficient and powerful computing platforms as generative AI continues to transform various industries.
The Versal AI Edge VE2302 SoM stands out as a robust solution, leveraging the adaptability of AMD Versal Adaptive SoCs. It offers an optimal balance of compute efficiency, power optimization, and AI acceleration, making it an ideal choice for next-generation AI applications across edge devices and data centers.
iWave, a global leader in the design and manufacturing of FPGA System on Modules with over 25 years of experience, expressed enthusiasm for the partnership. “iWave is thrilled to collaborate with RaiderChip to drive innovation in AI acceleration and together deliver high-performance GenAI LLM acceleration, seamlessly integrating hardware and software and bringing AI intelligence to edge devices,” stated a representative.
RaiderChip specializes in cutting-edge hardware accelerators designed to revolutionize the deployment and operation of LLMs at the edge. Their solutions prioritize unmatched performance, privacy, and flexibility, eliminating the need for expensive external CPUs. This enables LLMs to run locally, offline, and on-premises, fostering autonomous and customizable generative AI products.
A key highlight of this collaboration is the demonstrated performance: RaiderChip’s acceleration of Meta’s Llama 3.2 LLM, deployed on the iW-RainboW-G57M Versal 2302 System on Module, achieves 12 tokens per second. At that rate, responses stream at interactive speed, making the solution practical to integrate into a wide range of edge devices.
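To put the 12 tokens-per-second figure in perspective, here is a minimal throughput-measurement sketch. The `generate_token` callable is a hypothetical stand-in for any streaming LLM interface, not RaiderChip's actual API; the numbers below are purely illustrative:

```python
import time

def measure_tokens_per_second(generate_token, prompt, num_tokens=64):
    """Time a token-by-token generation loop and report throughput.

    `generate_token` is a hypothetical stand-in for a streaming
    LLM interface; any callable producing one token per call works.
    """
    start = time.perf_counter()
    for i in range(num_tokens):
        generate_token(prompt, i)
    elapsed = time.perf_counter() - start
    return num_tokens / elapsed

# At 12 tokens/s, a 60-token reply arrives in about 5 seconds,
# which is why this rate reads as "interactive" for a chatbot.
print(60 / 12)  # 5.0
```

Tokens per second is the standard figure of merit for LLM inference because it directly determines how quickly a reply streams to the user.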
The benefits of RaiderChip’s Edge AI solutions are extensive, including full privacy and independence by keeping sensitive data on-premises, offline operation without network connectivity, and reprogrammability for seamless updates. The solutions also support customizable models, allowing for the use of commercially friendly licensed and open-source LLMs like Llama, or fine-tuned models for specific tasks.
Applications for this accelerated technology are broad, encompassing chatbots, predictive analytics, real-time AI assistants, industrial control, home automation, defense, and medical devices. The energy-efficient design allows for generative AI solutions powered by an NPU inference engine to operate off-grid, potentially using solar panels, without requiring external power infrastructure. Furthermore, the absence of cloud reliance enhances cybersecurity and confidentiality, protecting against external cyberattacks and data leaks.
RaiderChip emphasizes a “facts over words” approach, offering interactive demos that allow potential customers to directly evaluate the capabilities of their solutions on-site. These demos showcase the performance of various Meta Llama and Microsoft Phi models on cost-effective AMD Versal FPGA devices, achieving interactive speeds in both vanilla and quantized models.
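The vanilla-versus-quantized distinction mentioned above refers to whether model weights are kept at full floating-point precision or compressed to fewer bits. As a generic illustration (this is a textbook symmetric int8 scheme, not a description of RaiderChip's actual quantization), a round trip looks like this:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric int8 quantization: map float weights into [-127, 127]."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# int8 storage needs 4x less memory and bandwidth than float32,
# at the cost of a small rounding error per weight.
print(np.max(np.abs(w - w_hat)))
```

Because LLM inference at the edge is typically memory-bandwidth-bound, this kind of compression is what lets large models reach interactive speeds on cost-effective FPGA devices, while running the vanilla model preserves full accuracy when bandwidth allows.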
This partnership sets a new standard in generative AI acceleration, empowering businesses to deploy robust, reliable, and private AI solutions at scale at the edge.