TLDR: d-Matrix, a leader in generative AI inference, has announced SquadRack, the industry’s first rack-scale solution specifically engineered for AI inference at datacenter scale. Developed in collaboration with industry giants like Arista, Broadcom, and Supermicro, SquadRack offers a blueprint for disaggregated, standards-based solutions designed for ultra-low latency batched inference, promising faster, more affordable, and energy-efficient AI operations.
SAN JOSE, Calif. – October 14, 2025 – d-Matrix, a pioneering force in generative AI inference, today unveiled SquadRack™, marking a significant advancement as the industry’s first rack-scale solution purpose-built for AI inference at datacenter scale. This announcement, made in collaboration with prominent AI infrastructure leaders Arista, Broadcom, and Supermicro, introduces a new blueprint for disaggregated, standards-based rack-scale solutions tailored for ultra-low latency batched inference.
At the core of the d-Matrix SquadRack architecture are server nodes packed with d-Matrix Corsair™ AI accelerators and d-Matrix JetStream™ IO accelerators. This integration is designed to deliver exceptionally fast and sustainable AI operations, addressing the latency, cost, and energy-consumption challenges that currently limit large-scale AI deployments.
SquadRack leverages industry standards-based Ethernet, enabling seamless scalability across hundreds of nodes and multiple racks. A single rack configuration, equipped with eight nodes, can run generative AI models with up to 100 billion parameters at high speed. For larger models or bigger deployments, the architecture supports scaling out across additional racks to meet demand.
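To put the single-rack figure in perspective, here is a rough sizing sketch based on the numbers above (100 billion parameters, eight nodes per rack). The per-parameter byte count and even weight sharding are illustrative assumptions, not published d-Matrix specifications:

```python
# Back-of-envelope sizing sketch (illustrative only; precision and sharding
# strategy are assumptions, not d-Matrix specifications).

def weights_per_node_gb(params_billion: float, bytes_per_param: float, nodes: int) -> float:
    """Approximate weight memory each node holds when model weights are
    sharded evenly across the nodes of a single rack."""
    total_gb = params_billion * bytes_per_param  # 1e9 params * bytes / 1e9 bytes-per-GB
    return total_gb / nodes

# Source figures: up to 100B parameters served by one 8-node rack.
# Assumption: 8-bit (1 byte) weights; real deployments may use other formats.
print(weights_per_node_gb(params_billion=100, bytes_per_param=1.0, nodes=8))
# -> 12.5 GB of weights per node, before activations, KV cache, and overheads.
```

This kind of estimate only bounds the weight footprint; actual capacity planning would also account for activation memory, batching, and interconnect bandwidth.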
d-Matrix emphasizes an open-standards hardware and software approach, so that SquadRack offers straightforward plug-and-play integration for datacenter operators and developers. This commitment to open standards aims to democratize access to high-performance AI inference, making it more accessible and manageable for a wider range of enterprises.
The introduction of SquadRack is poised to transform the economics of large-scale AI inference, delivering high efficiency, low latency, and a standards-based deployment model. This innovation underscores d-Matrix’s mission to provide faster, more affordable, and energy-efficient AI inference capabilities at datacenter scale, pushing the boundaries of what is possible in the rapidly evolving field of artificial intelligence.


