OpenAI

GPT OSS 120B

Name: OpenAI GPT OSS 120B
Author: OpenAI

Try It Compare

Model ID:openai/gpt-oss-120b

2025-08-05

Try It Compare

GPT-OSS-120B is OpenAI's first open-weight language model, featuring 117 billion total parameters in a Mixture-of-Experts architecture that activates just 5.1 billion per forward pass. Optimized to run on a single 80GB GPU with native MXFP4 quantization, it achieves near-parity with o4-mini on core reasoning benchmarks while supporting configurable reasoning depth, full chain-of-thought access, and native tool use including function calling and structured outputs. Released under the Apache 2.0 license, it brings frontier-level reasoning and agentic capabilities to a fully customizable, locally deployable model.

API|Reasoning|Open ModelApache 2.0

Knowledge Cutoff

2024-06-30

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

131KIN131KOUT

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.

Cost/1M Words

$0.039IN$0.19OUT

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).

Calculate Cost

Source:Official Docs OpenRouter

AI Performance Evaluation

Arena Overall Score

1353

±4

As of 2026-04-23

Overall Rank

No.149

30,674 Votes

Arena by Ability

Hard Prompts

1363±6No.156

Expert Knowledge

1360±17No.147

Instruction Following

1326±7No.163

Conversation Memory

1328±9No.171

Creative

1279±10No.203

Coding

1390±8No.154

Math

1383±14No.125

Arena by Occupation

Creative Writing

1310±8No.177

Social Sciences

1361±9No.160

Media

1287±8No.185

Business

1350±8No.154

Healthcare

1369±15No.151

Legal

1345±14No.171

Software

1386±6No.153

Mathematics

1384±15No.125

Source:Arena Intelligence

Overall

AA Intelligence Index

25%↓14%

LiveBench

46%↓14%

Reasoning & Math

AA Math Index

67%↓7%

GPQA Diamond

67%↓14%

HLE

5.2%↓12%

MMLU-Pro

78%↓4%

AIME 2025

67%↓7%

LB Reasoning

39%↓20%

LB Math

69%↓5%

LB Data

39%↓11%

Coding

AA Coding Index

16%↓19%

LiveCodeBench

71%↑5%

LB Coding

60%↓13%

LB Agentic

17%↓27%

TAU2

45%↓28%

TerminalBench

5.3%↓26%

SciCode

36%↓5%

Language & Instructions

IFBench

58%↑2%

AA-LCR

44%↓18%

Hallucination (HHEM)

14%↑4%

Factual (HHEM)

86%↓4%

LB Language

49%↓23%

LB IF

50%↑4%

Output Speed

Standard Mode

86tok/s↑4

First Output 0.48s

Reasoning Mode

214tok/s↑126

First Output 9.89s

Source:Artificial Analysis LiveBench Vectara HHEM OpenRouter

OpenAI