OpenAI

GPT OSS 120B

2025-08-05

GPT-OSS-120B is OpenAI's first open-weight language model since GPT-2, with 117 billion total parameters in a Mixture-of-Experts architecture that activates about 5.1 billion parameters per token. It is optimized to run on a single 80GB GPU with native MXFP4 quantization and achieves near-parity with o4-mini on core reasoning benchmarks. It supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling and structured outputs. Released under the Apache 2.0 license, it brings frontier-level reasoning and agentic capabilities to a fully customizable, locally deployable model.
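The parameter counts and the single-GPU claim can be sanity-checked with some rough arithmetic. The sketch below is only an estimate under stated assumptions: it treats every weight as MXFP4 (4-bit values plus one shared 8-bit scale per block of 32, so about 4.25 bits per parameter), which is a simplification, and it ignores the KV cache, activations, and runtime overhead. The function name `weight_footprint_gb` is illustrative, not part of any library.

```python
# Figures taken from the description above.
TOTAL_PARAMS = 117e9    # total parameters
ACTIVE_PARAMS = 5.1e9   # parameters activated per forward pass
GPU_MEMORY_GB = 80      # target GPU memory

# Assumption: MXFP4 stores 4-bit elements plus one shared 8-bit scale
# per block of 32 elements, i.e. 4 + 8/32 = 4.25 bits per parameter.
MXFP4_BITS_PER_PARAM = 4 + 8 / 32


def weight_footprint_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return n_params * bits_per_param / 8 / 1e9


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
mxfp4_gb = weight_footprint_gb(TOTAL_PARAMS, MXFP4_BITS_PER_PARAM)
bf16_gb = weight_footprint_gb(TOTAL_PARAMS, 16)

print(f"Active fraction per pass: {active_fraction:.1%}")
print(f"Weights if all MXFP4:     {mxfp4_gb:.1f} GB")
print(f"Weights if all 16-bit:    {bf16_gb:.1f} GB")
print(f"Fits in {GPU_MEMORY_GB} GB (weights only, all MXFP4): {mxfp4_gb < GPU_MEMORY_GB}")
```

Under these assumptions the weights alone come to roughly 62 GB, which leaves some headroom on an 80GB card, whereas a 16-bit copy of the same weights would not fit. A real deployment will differ because not every tensor is necessarily quantized and the KV cache grows with context length.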

API | Reasoning | Open Model | Apache 2.0
Knowledge Cutoff: 2024-06-30
Input → Output Format
Context Memory: 131K in / 131K out
Cost/1M Words: $0.039 in / $0.19 out
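As a rough illustration of how the listed prices translate into a bill, the sketch below multiplies word counts by the per-million-word rates shown above. It assumes the prices are in US dollars per one million words, as the label states, and that input and output are billed independently. The function name `estimate_cost_usd` and the example workload are hypothetical.

```python
# Prices from the listing above, in USD per 1,000,000 words.
INPUT_PRICE_PER_M_WORDS = 0.039
OUTPUT_PRICE_PER_M_WORDS = 0.19


def estimate_cost_usd(input_words: int, output_words: int) -> float:
    """Estimate cost assuming input and output words are billed separately."""
    input_cost = input_words / 1_000_000 * INPUT_PRICE_PER_M_WORDS
    output_cost = output_words / 1_000_000 * OUTPUT_PRICE_PER_M_WORDS
    return input_cost + output_cost


# Hypothetical workload: 200,000 words read, 50,000 words generated.
example_cost = estimate_cost_usd(200_000, 50_000)
print(f"Estimated cost: ${example_cost:.4f}")
```

Actual provider billing is normally metered in tokens rather than words, so treat the result as an order-of-magnitude estimate.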

AI Performance Evaluation

Arena Overall Score: 1353 ±4 (as of 2026-04-23)
Overall Rank: No. 149 (30,674 votes)
Arena by Ability
Hard Prompts: 1363 ±6, No. 156
Expert Knowledge: 1360 ±17, No. 147
Instruction Following: 1326 ±7, No. 163
Conversation Memory: 1328 ±9, No. 171
Creative: 1279 ±10, No. 203
Coding: 1390 ±8, No. 154
Math: 1383 ±14, No. 125
Arena by Occupation
Creative Writing: 1310 ±8, No. 177
Social Sciences: 1361 ±9, No. 160
Media: 1287 ±8, No. 185
Business: 1350 ±8, No. 154
Healthcare: 1369 ±15, No. 151
Legal: 1345 ±14, No. 171
Software: 1386 ±6, No. 153
Mathematics: 1384 ±15, No. 125
Overall
AA Intelligence Index: 25% (↓14%)
LiveBench: 46% (↓14%)

Reasoning & Math
AA Math Index: 67% (↓7%)
GPQA Diamond: 67% (↓14%)
HLE: 5.2% (↓12%)
MMLU-Pro: 78% (↓4%)
AIME 2025: 67% (↓7%)
LB Reasoning: 39% (↓20%)
LB Math: 69% (↓5%)
LB Data: 39% (↓11%)

Coding
AA Coding Index: 16% (↓19%)
LiveCodeBench: 71% (↑5%)
LB Coding: 60% (↓13%)
LB Agentic: 17% (↓27%)
TAU2: 45% (↓28%)
TerminalBench: 5.3% (↓26%)
SciCode: 36% (↓5%)

Language & Instructions
IFBench: 58% (↑2%)
AA-LCR: 44% (↓18%)
Hallucination (HHEM): 14% (↑4%)
Factual (HHEM): 86% (↓4%)
LB Language: 49% (↓23%)
LB IF: 50% (↑4%)

Output Speed
Standard Mode: 86 tok/s (↑4), first output 0.48 s
Reasoning Mode: 214 tok/s (↑126), first output 9.89 s
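
The two speed listings trade off latency against throughput, and a simple model makes that trade-off concrete. The sketch below assumes "First Output" means the delay before the first token appears and that tokens then arrive at the listed steady rate. Both are assumptions about how these figures were measured, and the names `Mode`, `generation_time_s`, and `break_even_tokens` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    tokens_per_s: float    # listed output speed
    first_output_s: float  # listed delay before the first output


# Figures from the Output Speed listing above.
STANDARD = Mode("Standard", tokens_per_s=86, first_output_s=0.48)
REASONING = Mode("Reasoning", tokens_per_s=214, first_output_s=9.89)


def generation_time_s(mode: Mode, n_tokens: int) -> float:
    """Total wall-clock time: initial delay plus steady-rate streaming."""
    return mode.first_output_s + n_tokens / mode.tokens_per_s


def break_even_tokens(slow_start: Mode, fast_start: Mode) -> float:
    """Output length at which both modes take the same total time."""
    delay_gap = slow_start.first_output_s - fast_start.first_output_s
    rate_gap = 1 / fast_start.tokens_per_s - 1 / slow_start.tokens_per_s
    return delay_gap / rate_gap


for n in (100, 1000, 5000):
    print(n, round(generation_time_s(STANDARD, n), 2),
          round(generation_time_s(REASONING, n), 2))

crossover = break_even_tokens(REASONING, STANDARD)
print(f"Break-even output length: about {crossover:.0f} tokens")
```

Under this model, short responses finish sooner in standard mode, while longer outputs can finish sooner in reasoning mode despite its longer initial delay. Whether the reasoning-mode rate counts hidden reasoning tokens is not stated in the listing, so the crossover point is indicative only.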