TLDR: OpenAI, in partnership with NVIDIA, has launched two new open-weight AI reasoning models, gpt-oss-120b and gpt-oss-20b, aimed at democratizing advanced AI development. These models, trained on NVIDIA H100 GPUs and optimized for NVIDIA’s CUDA platform and Blackwell architecture, offer high-efficiency inference, reaching 1.5 million tokens per second on GB200 NVL72 systems. This collaboration underscores a commitment to open-source innovation and making AI accessible across various industries and scales globally.
OpenAI, in a significant collaboration with NVIDIA, has introduced two open-weight AI reasoning models, gpt-oss-120b and gpt-oss-20b. The models are designed to bring cutting-edge AI development capabilities to a broad spectrum of users, including developers, enthusiasts, enterprises, startups, and governments worldwide, across every industry and scale.
NVIDIA’s involvement in the release of these open models highlights its role in fostering community-driven innovation and expanding global access to AI technologies. The models are versatile, enabling breakthrough applications in generative AI, reasoning AI, physical AI, healthcare, and manufacturing, and could unlock new industries as the AI-driven industrial revolution progresses.
The new flexible, open-weight text-reasoning large language models (LLMs) from OpenAI were trained using NVIDIA H100 GPUs. For optimal inference performance, they are designed to run efficiently on the hundreds of millions of GPUs powered by the NVIDIA CUDA platform globally. These models are now available as NVIDIA NIM microservices, facilitating easy deployment on any GPU-accelerated infrastructure while ensuring flexibility, data privacy, and enterprise-grade security.
Further enhancing their performance, the models feature software optimizations for the NVIDIA Blackwell platform. When deployed on NVIDIA GB200 NVL72 systems, they achieve an impressive inference rate of 1.5 million tokens per second, significantly boosting efficiency for AI inference tasks.
Jensen Huang, founder and CEO of NVIDIA, commented on the collaboration, stating, “OpenAI showed the world what could be built on NVIDIA AI — and now they’re advancing innovation in open-source software.” He added, “The gpt-oss models let developers everywhere build on that state-of-the-art open-source foundation, strengthening U.S. technology leadership in AI — all on the world’s largest AI compute infrastructure.”
NVIDIA Blackwell is crucial for advanced reasoning: as models like gpt-oss generate exponentially more tokens, the demand on compute infrastructure escalates. The Blackwell architecture is purpose-built to meet this demand, offering the scale, efficiency, and return on investment that high-level inference requires. Among its innovations is NVFP4 4-bit precision, which enables ultra-efficient, high-accuracy inference while substantially reducing power and memory requirements. This technology makes it feasible to deploy trillion-parameter LLMs in real time, potentially generating billions of dollars in value for organizations.
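The memory savings behind 4-bit formats come from storing each weight in a few bits plus a shared per-block scale. The sketch below is a simplified integer version of block-scaled 4-bit quantization, written to illustrate the principle only; NVFP4 itself uses an FP4 (E2M1) element format with hardware-level block scaling, which this toy code does not reproduce.

```python
# Simplified block-scaled 4-bit quantization. Real NVFP4 uses an FP4
# (E2M1) element format with a shared scale per small block of values;
# this sketch uses signed 4-bit integers purely to show the idea.

BLOCK = 16          # number of values sharing one scale factor
QMAX = 7            # symmetric 4-bit integer range: -7 .. +7

def quantize_block(values):
    """Quantize one block to 4-bit ints plus a shared float scale."""
    scale = max(abs(v) for v in values) / QMAX or 1.0
    q = [max(-QMAX, min(QMAX, round(v / scale))) for v in values]
    return q, scale

def dequantize_block(q, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [x * scale for x in q]

weights = [0.02 * i - 0.15 for i in range(BLOCK)]   # toy weight block
q, scale = quantize_block(weights)
restored = dequantize_block(q, scale)

# Storage: 16 values * 4 bits + one scale, versus 16 * 16 bits in FP16,
# i.e. roughly a 4x reduction before accounting for the scale overhead.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(scale, 4), round(max_err, 4))
```

The per-block scale is what keeps accuracy acceptable at such low precision: each block of 16 values is mapped onto the 4-bit range independently, so one large weight does not crush the resolution of the whole tensor.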
For open development, NVIDIA CUDA stands as the world’s most widely available computing infrastructure, allowing users to deploy and run AI models across various platforms, from NVIDIA DGX Cloud to NVIDIA GeForce RTX and NVIDIA RTX PRO-powered PCs and workstations. With over 450 million NVIDIA CUDA downloads to date, the vast community of CUDA developers now gains access to these latest models, optimized for their existing NVIDIA technology stack.
OpenAI and NVIDIA’s commitment to open-source software is further demonstrated through their collaboration with leading open framework providers. They have provided model optimizations for FlashInfer, Hugging Face, llama.cpp, Ollama, and vLLM, in addition to NVIDIA TensorRT-LLM and other libraries, offering developers flexibility in their framework choices.
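Frameworks such as vLLM and Ollama can expose an OpenAI-compatible HTTP endpoint, so a locally served gpt-oss model can be queried with a plain JSON chat-completions request. The base URL and model tag below are assumptions for a typical local setup, not values from the article; adjust them to match your own server.

```python
# Build an OpenAI-compatible chat-completions request for a locally
# served gpt-oss model. The localhost URL and "gpt-oss-20b" model name
# are placeholder assumptions for a typical vLLM/Ollama deployment.
import json

def build_chat_request(prompt, model="gpt-oss-20b",
                       base_url="http://localhost:8000/v1"):
    """Assemble the endpoint URL and JSON body for a chat request."""
    url = f"{base_url}/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return url, json.dumps(body)

url, payload = build_chat_request("Summarize NVFP4 in one sentence.")
print(url)
print(payload)

# Actually sending it requires a running server, e.g.:
#   import urllib.request
#   req = urllib.request.Request(
#       url, data=payload.encode(),
#       headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read().decode())
```

Because the request shape is the standard chat-completions format, the same client code works unchanged whether the model is served by vLLM, Ollama, or an NVIDIA NIM microservice that exposes the same API.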
This collaboration builds on a long history, dating back to 2016, when Jensen Huang personally delivered the first NVIDIA DGX-1 AI supercomputer to OpenAI’s headquarters. By optimizing OpenAI’s gpt-oss models for its Blackwell and RTX GPUs and its extensive software stack, NVIDIA is enabling faster, more cost-effective AI advancements for its 6.5 million developers across 250 countries, who use over 900 NVIDIA software development kits and AI models.


