TLDR: AMD has announced significant advancements in AI efficiency and scalability with its Instinct GPUs, as showcased in the recent MLPerf Inference v5.1 tests. The results highlight the cost-effectiveness of AMD’s solutions for generative AI deployments, with strong performance from the Instinct MI355X and MI325X GPUs in key benchmarks such as Llama 2 70B and Stable Diffusion XL.
AMD has once again demonstrated its growing prowess in the artificial intelligence landscape, revealing impressive efficiency gains and scalability for its Instinct GPUs in the latest MLPerf Inference v5.1 benchmarks. The company’s submissions underscore its commitment to delivering high-performance, cost-effective solutions for the rapidly expanding generative AI market.
The MLPerf Inference v5.1 results, published on September 9, 2025, highlight that AMD’s Instinct GPUs offer ‘breakthrough AI efficiency and scalability,’ positioning them as a strong contender for generative AI deployments. Key to these achievements are the Instinct MI355X and MI325X GPUs, which have shown competitive performance in demanding AI workloads.
Specifically, the Instinct MI325X GPU, first featured in the MLPerf Inference v5.0 results released on April 2, 2025, showed robust capabilities in benchmarks such as Llama 2 70B and Stable Diffusion XL. In the Llama 2 70B benchmark, a popular measure within the MLPerf Inference suite, AMD’s solutions delivered ‘highly competitive results.’ Notably, a 4-node MI300X cluster, enabled by MangoBoost’s LLMBoost stack, achieved 103K tokens/sec in the Llama 2 70B Offline scenario, the highest throughput ever recorded in an MLPerf submission for this benchmark.
AMD’s success is not solely attributed to hardware. The company emphasizes the role of its ROCm software optimization, which is crucial for unlocking top-tier performance and ensuring price and efficiency advantages in large-scale AI workloads. The optimizations behind these results include quantization, General Matrix Multiplication (GEMM) tuning, cutting-edge vLLM scheduling, and platform enhancements.
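To make the quantization idea concrete, here is a minimal illustrative sketch of symmetric per-tensor INT8 weight quantization, the general technique referenced above. This is not AMD’s ROCm implementation; the function names and the NumPy-based approach are assumptions for illustration only.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor INT8 quantization (illustrative, not ROCm's code).

    Maps float weights onto the integer range [-127, 127] using a single
    scale factor derived from the largest-magnitude weight.
    """
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the INT8 representation."""
    return q.astype(np.float32) * scale

# Round-trip a small random weight matrix and check the worst-case error,
# which for symmetric rounding is bounded by half the scale.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
max_err = np.max(np.abs(dequantize(q, scale) - w))
print(q.dtype, max_err <= scale / 2 + 1e-6)
```

Storing weights as INT8 halves (or quarters, versus FP32) memory traffic per parameter, which is why quantization is a common lever for inference throughput on memory-bandwidth-bound workloads like Llama 2 70B.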
Furthermore, AMD’s collaborative approach was evident in its partnerships with industry leaders such as Supermicro, Giga Computing, and ASUSTeK, which submitted MI325X-based system solutions. This ecosystem support further validates the scalability and performance of AMD Instinct solutions in real-world AI applications.
The competitive landscape also saw NVIDIA’s Blackwell Ultra GB300 appearing in the MLPerf v5.1 benchmarks, indicating a fierce race for AI leadership. However, AMD’s consistent performance and focus on efficiency and scalability demonstrate its strong position in the evolving AI infrastructure market.


