
OpenAI Unveils gpt-oss-120b and gpt-oss-20b: New Open-Weight Models for Agentic AI

TLDR: OpenAI has released gpt-oss-120b and gpt-oss-20b, two new open-weight reasoning models designed for agentic AI workflows, featuring strong instruction following, tool use (web search, Python), and variable reasoning effort. Built on MoE transformers, they are optimized for memory efficiency and trained on vast datasets with safety filters. Evaluations show strong performance in reasoning, coding, tool use, health, and multilingual tasks, often competitive with or surpassing smaller closed models. Extensive safety testing, including adversarial fine-tuning simulations, confirms they do not reach high-risk capability thresholds in biological/chemical or cyber domains, and their release does not significantly advance the frontier of hazardous open-source AI capabilities.

OpenAI has introduced two new open-weight reasoning models, gpt-oss-120b and gpt-oss-20b, designed to be accessible and customizable for a wide range of applications. These models are released under the Apache 2.0 license and OpenAI’s gpt-oss usage policy, marking a significant step towards broader availability of advanced AI capabilities.

These text-only models are built for agentic workflows, meaning they excel at following complex instructions, using tools like web search and Python code execution, and performing sophisticated reasoning. A unique feature is their ability to adjust reasoning effort, allowing them to be more efficient for simpler tasks while still providing detailed, full chain-of-thought (CoT) outputs for complex problems. They also support Structured Outputs, making them versatile for various development needs.
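As a concrete illustration of the Structured Outputs workflow, here is a minimal sketch: the application supplies a JSON Schema and then parses a conforming reply. The schema, the reply string, and the manual required-key check are all invented for this example; the actual constrained decoding happens on the serving side.

```python
import json

# Minimal sketch of a Structured Outputs round trip. The schema and
# the model reply below are illustrative stand-ins; in a real
# deployment the serving stack constrains generation to the schema.

SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["title", "year"],
}

# Stand-in for a model reply constrained by SCHEMA.
model_reply = '{"title": "gpt-oss model card", "year": 2025}'

record = json.loads(model_reply)
missing = [k for k in SCHEMA["required"] if k not in record]
assert not missing, f"reply missing required keys: {missing}"
print(record["title"], record["year"])
```

Because the output is guaranteed to parse, downstream code can consume it directly instead of scraping free-form text.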

Under the Hood: Architecture and Training

The gpt-oss models are based on autoregressive Mixture-of-Experts (MoE) transformers, evolving from the well-known GPT-2 and GPT-3 architectures. The larger gpt-oss-120b boasts 116.8 billion total parameters, with 5.1 billion active parameters per token, while the more compact gpt-oss-20b has 20.9 billion total parameters and 3.6 billion active parameters. To make these powerful models more accessible, OpenAI utilized quantization, specifically MXFP4 format, for the MoE weights. This innovation allows the 120b model to run on a single 80GB GPU and the 20b model on systems with as little as 16GB memory.
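A rough back-of-envelope calculation shows why MXFP4 makes the single-GPU claim plausible. The MoE fraction below is an illustrative assumption, not a figure from the model card; the only grounded inputs are the total parameter count and the ~4.25 bits per weight that MXFP4 costs (4-bit values plus a shared per-block scale).

```python
# Back-of-envelope weight-memory estimate for gpt-oss-120b.
# Assumption (illustrative): ~90% of the 116.8B parameters sit in
# MoE expert layers stored in MXFP4; the remainder stays in bf16.
# MXFP4 costs ~4.25 bits/weight (4-bit values + amortized block scale).

TOTAL_PARAMS = 116.8e9
MOE_FRACTION = 0.90          # assumed share of weights in MoE experts
MXFP4_BITS = 4.25            # 4-bit mantissa + shared per-block scale
BF16_BITS = 16

def weight_gib(total, moe_frac, moe_bits, rest_bits):
    """Return approximate weight storage in GiB."""
    bits = total * (moe_frac * moe_bits + (1 - moe_frac) * rest_bits)
    return bits / 8 / 2**30

size = weight_gib(TOTAL_PARAMS, MOE_FRACTION, MXFP4_BITS, BF16_BITS)
print(f"~{size:.0f} GiB of weights")   # fits under a single 80 GB GPU
```

Under the same assumptions, keeping everything in bf16 would roughly triple the footprint, which is why the quantized MoE weights are the deciding factor.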

The models employ a custom tokenizer, o200k_harmony, which is open-sourced in OpenAI’s tiktoken library. Pre-training involved trillions of text-only tokens, with a strong emphasis on STEM (Science, Technology, Engineering, and Mathematics), coding, and general knowledge. Crucially, the training data was filtered for harmful content, particularly related to hazardous biosecurity knowledge, to enhance safety. The models’ knowledge cutoff is June 2024.

Post-training, the gpt-oss models were fine-tuned using advanced Chain-of-Thought Reinforcement Learning (CoT RL) techniques, similar to those used for OpenAI’s o3 models. This process imbues them with strong reasoning and problem-solving abilities, giving them a “personality” akin to models found in OpenAI’s first-party products like ChatGPT.

Advanced Features: Harmony Chat and Tool Use

A key innovation is the “harmony chat format,” a custom structure for model training that uses special tokens and keyword arguments to define message boundaries and roles (System, Developer, User, Assistant). This format supports “channels” for different levels of message visibility, enabling sophisticated agentic features such as interleaving tool calls within the CoT or outlining detailed action plans to the user. Proper deployment of this format is essential to unlock the models’ full potential.
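To make this concrete, here is a hedged sketch of how a harmony-style conversation might be rendered. The delimiter tokens and channel names follow OpenAI's published description of the format, but treat them as an approximation; production code should use OpenAI's own harmony renderer rather than hand-built strings.

```python
# Illustrative sketch of a harmony-style chat rendering. The real
# format uses reserved special tokens from the o200k_harmony
# tokenizer; the delimiters and channel names below are an
# approximation of the published spec, not a drop-in implementation.

def render_message(role, content, channel=None):
    """Render one chat message with role, optional channel, and body."""
    header = role if channel is None else f"{role}<|channel|>{channel}"
    return f"<|start|>{header}<|message|>{content}<|end|>"

conversation = "".join([
    render_message("system", "Reasoning: high"),
    render_message("user", "What is 17 * 24?"),
    # The chain of thought goes to the 'analysis' channel...
    render_message("assistant", "17 * 24 = 17 * 20 + 17 * 4 = 408.",
                   channel="analysis"),
    # ...while the user-visible answer goes to 'final'.
    render_message("assistant", "408", channel="final"),
])
print(conversation)
```

The channel separation is what lets a deployment show users only the `final` message while keeping the `analysis` chain of thought available for tool calls or moderation.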

The models also support “Variable Effort Reasoning Training,” allowing users to specify “low,” “medium,” or “high” reasoning levels in the system prompt. Increasing the reasoning level leads to longer CoTs and higher accuracy, though at the cost of increased latency and computational expense. This flexibility allows users to balance performance with resource usage.
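In practice, an application might route requests to different effort levels automatically. Only the convention of placing “Reasoning: low/medium/high” in the system prompt comes from the model card; the routing heuristic below is purely an illustrative application-side sketch.

```python
# Illustrative policy for picking a reasoning level per request.
# The effort keyword itself ("low"/"medium"/"high") is set in the
# system prompt per the model card; this routing heuristic is an
# application-side example, not part of the models' API.

EFFORT_LEVELS = ("low", "medium", "high")

def pick_effort(prompt: str) -> str:
    """Crude heuristic: proof/debug-style or long prompts get more effort."""
    if any(kw in prompt.lower() for kw in ("prove", "derive", "debug")):
        return "high"
    return "medium" if len(prompt.split()) > 50 else "low"

def system_prompt(effort: str) -> str:
    assert effort in EFFORT_LEVELS
    return f"You are a helpful assistant.\nReasoning: {effort}"

print(system_prompt(pick_effort("Prove that sqrt(2) is irrational.")))
```

A router like this lets simple lookups stay cheap and fast while hard problems get the longer, more accurate chains of thought.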

Agentic tool use is another highlight. The models are trained to interact with a browsing tool for web search and information retrieval beyond their knowledge cutoff, a Python tool for code execution in a Jupyter notebook environment, and arbitrary developer-defined functions. This capability significantly enhances their utility for complex, real-world tasks.
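Here is a hedged sketch of the developer-defined-function side of this. The schema shape (name, description, JSON-Schema parameters) follows the common function-calling convention; the exact wire format for gpt-oss is defined by the harmony spec, so the dispatch loop below is illustrative, and the `get_weather` tool is invented.

```python
import json

# Sketch of wiring a developer-defined function tool. The schema
# shape follows the common function-calling convention; the exact
# wire format for gpt-oss is set by the harmony spec, so treat this
# as an application-side sketch. 'get_weather' is a made-up tool.

TOOLS = {
    "get_weather": {
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        # Local implementation the app runs on the model's behalf.
        "fn": lambda city: {"city": city, "temp_c": 21},  # stubbed result
    }
}

def dispatch(tool_call: str) -> str:
    """Execute a model-emitted tool call and return a JSON result."""
    call = json.loads(tool_call)
    tool = TOOLS[call["name"]]
    return json.dumps(tool["fn"](**call["arguments"]))

# A model choosing this tool would emit a call along these lines:
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
print(result)
```

The application feeds `result` back to the model as a tool message, and the model continues reasoning with the returned data.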

Safety and Preparedness: A Core Focus

Safety is a foundational principle for these open models. OpenAI recognizes that open-weight models present a different risk profile, as they can be fine-tuned by malicious actors to bypass safety measures. To address this, the gpt-oss models are trained to adhere to OpenAI’s safety policies by default, employing techniques like deliberative alignment to refuse illicit content and resist “jailbreaks” (prompts designed to circumvent safety refusals).

OpenAI conducted extensive “Preparedness Framework” evaluations on gpt-oss-120b, including simulating adversarial fine-tuning for biological/chemical and cybersecurity risks. The findings indicate that even with robust fine-tuning, gpt-oss-120b did not reach OpenAI’s “High capability” thresholds in these sensitive domains. Furthermore, the release of gpt-oss-120b is not expected to significantly advance the frontier of hazardous biological capabilities in open foundation models, as other existing open models already demonstrate comparable performance.

Evaluations of default safety performance show that gpt-oss-120b and gpt-oss-20b generally perform on par with OpenAI o4-mini in handling disallowed content and resisting jailbreaks. While they may underperform o4-mini in strictly adhering to instruction hierarchies (e.g., preventing users from extracting system prompts), they remain robust to known jailbreaks. Developers have the option to fine-tune these models further for enhanced robustness.

It’s important to note that the models’ chains of thought are not directly optimized for safety, meaning they can contain “hallucinated” content or language that doesn’t align with OpenAI’s safety policies. Developers are advised to filter or moderate CoTs before exposing them to end-users. As expected for smaller models, gpt-oss models show a higher hallucination rate on fact-seeking questions compared to larger frontier models, though browsing capabilities can mitigate this.


Performance Across Benchmarks

The gpt-oss models were rigorously evaluated across various benchmarks:

  • Reasoning and Factuality: Strong performance in math (AIME), leveraging long CoTs. Competitive on GPQA, HLE, and MMLU.
  • Coding and Tool Use: Particularly strong in coding and tool-use tasks (Codeforces, SWE-Bench, τ-Bench Retail), with gpt-oss-120b approaching OpenAI o4-mini’s performance.
  • Health Performance: gpt-oss-120b performs competitively with leading closed models like OpenAI o3 on HealthBench, even outperforming GPT-4o and other OpenAI models in some health-related conversations. This suggests significant potential for making health intelligence more widely accessible, though it’s crucial to remember these models are not a substitute for medical professionals.
  • Multilingual Capabilities: Evaluated on MMMLU across 14 languages, gpt-oss-120b demonstrates performance close to OpenAI o4-mini-high.

OpenAI’s commitment to advancing beneficial AI and raising safety standards is evident in this release. By providing these powerful yet carefully evaluated open-weight models, OpenAI aims to foster innovation while responsibly managing the associated risks.

For further details, you can read the full research paper: gpt-oss-120b & gpt-oss-20b Model Card.

Nikhil Patel (https://blogs.edgentiq.com)
Nikhil Patel is a tech analyst and AI news reporter who brings a practitioner's perspective to every article. With prior experience working at an AI startup, he decodes the business mechanics behind product innovations, funding trends, and partnerships in the GenAI space. Nikhil's insights are sharp, forward-looking, and trusted by insiders and newcomers alike. You can reach him at: [email protected]
