
OpenAI and NVIDIA Unveil New Open-Weight AI Models for Global Inference Infrastructure

TLDR: OpenAI, in partnership with NVIDIA, has launched two new open-weight AI reasoning models, gpt-oss-120b and gpt-oss-20b, aimed at democratizing advanced AI development. These models, trained on NVIDIA H100 GPUs and optimized for NVIDIA’s CUDA platform and Blackwell architecture, offer high-efficiency inference, reaching 1.5 million tokens per second on GB200 NVL72 systems. This collaboration underscores a commitment to open-source innovation and making AI accessible across various industries and scales globally.

OpenAI, in a significant collaboration with NVIDIA, has introduced two groundbreaking open-weight AI reasoning models, gpt-oss-120b and gpt-oss-20b. These models are designed to extend cutting-edge AI development capabilities to a broad spectrum of users, including developers, enthusiasts, enterprises, startups, and governments worldwide, spanning every industry and scale.

NVIDIA’s involvement in the release of these open models, gpt-oss-120b and gpt-oss-20b, highlights its pivotal role in fostering community-driven innovation and expanding global access to AI technologies. The models are versatile, enabling the development of breakthrough applications in generative AI, reasoning AI, physical AI, healthcare, and manufacturing, potentially unlocking new industries as the AI-driven industrial revolution progresses.

The new flexible, open-weight text-reasoning large language models (LLMs) from OpenAI were trained using NVIDIA H100 GPUs. For optimal inference performance, they are designed to run efficiently on the hundreds of millions of GPUs powered by the NVIDIA CUDA platform globally. These models are now available as NVIDIA NIM microservices, facilitating easy deployment on any GPU-accelerated infrastructure while ensuring flexibility, data privacy, and enterprise-grade security.

Further enhancing their performance, the models feature software optimizations for the NVIDIA Blackwell platform. When deployed on NVIDIA GB200 NVL72 systems, they achieve an impressive inference rate of 1.5 million tokens per second, significantly boosting efficiency for AI inference tasks.
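A quick back-of-envelope check puts that rack-level figure in perspective. The sketch below divides NVIDIA's reported aggregate rate by the 72 Blackwell GPUs a GB200 NVL72 links together; it is illustrative arithmetic only, not a benchmark.

```python
# Back-of-envelope throughput check (illustrative; the 1.5M tokens/s
# figure is NVIDIA's reported aggregate for a GB200 NVL72 system).
AGGREGATE_TOKENS_PER_SEC = 1_500_000  # reported rack-level inference rate
GPUS_PER_NVL72 = 72                   # a GB200 NVL72 links 72 Blackwell GPUs

per_gpu = AGGREGATE_TOKENS_PER_SEC / GPUS_PER_NVL72
print(f"~{per_gpu:,.0f} tokens/s per GPU")
```

That works out to roughly 20,800 tokens per second per GPU, before accounting for batching or interconnect effects.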

Jensen Huang, founder and CEO of NVIDIA, commented on the collaboration, stating, “OpenAI showed the world what could be built on NVIDIA AI — and now they’re advancing innovation in open-source software.” He added, “The gpt-oss models let developers everywhere build on that state-of-the-art open-source foundation, strengthening U.S. technology leadership in AI — all on the world’s largest AI compute infrastructure.”

NVIDIA emphasizes that Blackwell is crucial for advanced reasoning, as the demand on compute infrastructure escalates when models like gpt-oss generate exponentially more tokens. The Blackwell architecture is purpose-built to meet this demand, offering the necessary scale, efficiency, and return on investment for high-volume inference. Its innovations include NVFP4 4-bit precision, which enables ultra-efficient, high-accuracy inference while substantially reducing power and memory requirements. This technology makes it feasible to deploy trillion-parameter LLMs in real time, potentially generating billions of dollars in value for organizations.
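The memory savings behind that 4-bit claim are easy to sketch. The rough math below compares the weight footprint of a 120-billion-parameter model at FP16 versus 4-bit precision; it deliberately ignores activations, KV cache, and per-tensor scale overhead, so treat it as an order-of-magnitude illustration only.

```python
# Rough memory math for why 4-bit precision matters (illustrative only;
# ignores activations, KV cache, and quantization scale overhead).
def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Gigabytes needed to hold `params` weights at the given precision."""
    return params * bits_per_weight / 8 / 1e9

PARAMS_120B = 120e9  # gpt-oss-120b parameter count

fp16 = weight_memory_gb(PARAMS_120B, 16)
fp4 = weight_memory_gb(PARAMS_120B, 4)
print(f"FP16: {fp16:.0f} GB, 4-bit: {fp4:.0f} GB ({fp16 / fp4:.0f}x smaller)")
```

Weights alone drop from roughly 240 GB to 60 GB, which is what moves a model of this size from multi-node territory toward a single high-memory system.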

For open development, NVIDIA CUDA stands as the world’s most widely available computing infrastructure, allowing users to deploy and run AI models across various platforms, from NVIDIA DGX Cloud to NVIDIA GeForce RTX and NVIDIA RTX PRO-powered PCs and workstations. With over 450 million NVIDIA CUDA downloads to date, the vast community of CUDA developers now gains access to these latest models, optimized for their existing NVIDIA technology stack.

OpenAI and NVIDIA’s commitment to open-source software is further demonstrated through their collaboration with leading open framework providers. They have provided model optimizations for FlashInfer, Hugging Face, llama.cpp, Ollama, and vLLM, in addition to NVIDIA TensorRT-LLM and other libraries, offering developers flexibility in their framework choices.
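In practice, that framework flexibility often surfaces as an OpenAI-compatible HTTP API, which servers such as vLLM and Ollama can expose for locally hosted models. The sketch below shows the general shape of such a chat-completion request; the model name, endpoint URL, and prompt are illustrative assumptions, not confirmed details from the release.

```python
import json

# Hypothetical request body for a locally served gpt-oss-20b behind an
# OpenAI-compatible endpoint (e.g. one started by vLLM or Ollama).
# The model name and endpoint below are illustrative assumptions.
payload = {
    "model": "gpt-oss-20b",
    "messages": [
        {"role": "user", "content": "Summarize NVFP4 precision in one sentence."}
    ],
    "max_tokens": 128,
}

# With a server running, this payload would be POSTed to something like
# http://localhost:8000/v1/chat/completions with Content-Type: application/json.
print(json.dumps(payload, indent=2))
```

Because the request shape is shared across these servers, switching between serving stacks is largely a matter of changing the endpoint and model identifier.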


This collaboration builds on a long history, dating back to 2016 when Jensen Huang personally delivered the first NVIDIA DGX-1 AI supercomputer to OpenAI’s headquarters. By optimizing OpenAI’s gpt-oss models for NVIDIA Blackwell and RTX GPUs, alongside NVIDIA’s extensive software stack, NVIDIA is facilitating faster, more cost-effective AI advancements for its 6.5 million developers across 250 countries, who utilize over 900 NVIDIA software development kits and AI models.

Dev Sundaram
https://blogs.edgentiq.com
Dev Sundaram is an investigative tech journalist with a nose for exclusives and leaks. With stints in cybersecurity and enterprise AI reporting, Dev thrives on breaking big stories—product launches, funding rounds, regulatory shifts—and giving them context. He believes journalism should push the AI industry toward transparency and accountability, especially as generative AI becomes mainstream. You can reach him at: [email protected]
