OpenAI

GPT-5.4

Name: OpenAI GPT-5.4
Author: OpenAI

Model ID: gpt-5.4-2026-03-05

2026-03-05

GPT-5.4 is OpenAI's latest frontier model, released in March 2026, and unifies the Codex and GPT product lines into a single system. It features a 1M+ token context window, native computer-use capabilities, and industry-leading coding performance inherited from GPT-5.3-Codex. The model is significantly more token-efficient than GPT-5.2 and achieves state-of-the-art results on knowledge work benchmarks, matching or exceeding industry professionals in 83% of comparisons across 44 occupations. It excels at agentic coding, document understanding, tool use, and complex multi-step workflows.

OpenAI Plus, OpenAI Pro, API | Vision, Reasoning, Web Search, File | Proprietary Model

Knowledge Cutoff

2025-08-31

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

1.1M IN / 128K OUT

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
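
As a rough illustration, a request can be checked against the listed limits of 1.1M input and 128K output tokens before it is sent. This is a minimal sketch: `fits_limits` is a hypothetical helper, and it assumes that "1.1M" and "128K" are decimal token counts and that the input and output limits apply independently, neither of which this page confirms.

```python
# Limits as listed on this page; decimal interpretation is an assumption.
MAX_INPUT_TOKENS = 1_100_000   # "1.1M IN"
MAX_OUTPUT_TOKENS = 128_000    # "128K OUT"


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both token counts are within the listed limits.

    Hypothetical helper: treats the input and output limits as
    independent, which may not match the provider's actual accounting.
    """
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


# Example: a 500,000-token document with a 4,000-token reply budget.
print(fits_limits(500_000, 4_000))
```

Token counts depend on the tokenizer, so the numbers passed in should come from an actual token count rather than a word or character count.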

Cost/1M Tokens

$2.50 IN / $15 OUT

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
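
The per-request cost follows directly from the listed prices of $2.50 per 1 million input tokens and $15 per 1 million output tokens. The sketch below shows the arithmetic; `estimate_cost` is a hypothetical helper, and it assumes flat per-token pricing with no caching discounts, tiering, or separately billed reasoning tokens.

```python
# Prices as listed on this page, in USD per 1 million tokens.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Example: a 10,000-token prompt with a 2,000-token reply.
print(f"${estimate_cost(10_000, 2_000):.4f}")
```

For this example the input side contributes $0.025 and the output side $0.03, so output tokens dominate the bill even though the prompt is five times longer than the reply.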

Source: Official Docs, OpenAI GPT-5 Blog, LMSYS Chatbot Arena, OpenRouter

AI Performance Evaluation

Arena Overall Score: 1481 ±6

As of 2026-04-23

Overall Rank: No. 9

13,593 Votes
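
To give a feel for what a gap in Arena score means, the sketch below converts a rating difference into an expected head-to-head win rate. It assumes the score is on the conventional Elo scale (base 10, 400-point divisor), which this page does not state. `expected_win_rate` is a hypothetical helper, and the 1381 opponent rating is made up for illustration.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected probability that A is preferred over B.

    Uses the standard Elo logistic formula; applying it to Arena
    scores is an assumption, not something this page confirms.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


# Example: the listed overall score against a hypothetical model rated 100 points lower.
print(round(expected_win_rate(1481, 1381), 3))
```

Under this assumption a 100-point lead corresponds to being preferred in roughly 64% of pairwise votes, so differences of a few points, comparable to the ± intervals shown here, are close to a coin flip.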

Arena by Ability

Hard Prompts: 1503 ±7, No. 8
Expert Knowledge: 1526 ±19, No. 5
Instruction Following: 1481 ±10, No. 7
Conversation Memory: 1497 ±12, No. 8
Creative: 1448 ±14, No. 15
Coding: 1532 ±11, No. 6
Math: 1515 ±20, 🥈 No. 2

Arena by Occupation

Creative Writing: 1470 ±11, No. 7
Social Sciences: 1479 ±13, No. 24
Media: 1448 ±13, No. 14
Business: 1477 ±12, No. 10
Healthcare: 1475 ±20, No. 32
Legal: 1471 ±19, No. 25
Software: 1513 ±9, No. 9
Mathematics: 1516 ±22, 🥉 No. 3

Source: Arena Intelligence

Overall

AA Intelligence Index: 57% (↑18%)
LiveBench: 81% (↑21%)
ForecastBench: 58% (↓1%)

Reasoning & Math

GPQA Diamond: 92% (↑11%)
HLE: 42% (↑25%)
LB Reasoning: 88% (↑28%)
LB Math: 94% (↑21%)
LB Data: 79% (↑30%)

Coding

AA Coding Index: 57% (↑23%)
LB Coding: 78% (↑4%)
LB Agentic: 70% (↑27%)
TAU2: 87% (↑14%)
TerminalBench: 58% (↑26%)
SciCode: 57% (↑16%)

Language & Instructions

IFBench: 74% (↑17%)
AA-LCR: 74% (↑12%)
Hallucination (HHEM): 7.0% (↓3%)
Factual (HHEM): 93% (↑3%)
LB Language: 83% (↑11%)
LB IF: 70% (↑24%)

Output Speed

Standard Mode: 155 tok/s (↑73), First Output 0.55 s

Reasoning Mode: 152 tok/s (↑64), First Output 7.32 s
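
The two speed figures can be combined into a rough end-to-end latency estimate: the time to first output plus the output length divided by throughput. This is a sketch only. `estimate_response_seconds` is a hypothetical helper, and it assumes the listed tok/s rate is sustained for the whole response and excludes the first-output delay, which this page does not specify.

```python
def estimate_response_seconds(
    output_tokens: int,
    tokens_per_second: float,
    first_output_seconds: float,
) -> float:
    """Approximate wall-clock time to receive a full response (hypothetical helper)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return first_output_seconds + output_tokens / tokens_per_second


# Figures as listed on this page; a 1,000-token reply is an arbitrary example.
standard = estimate_response_seconds(1_000, 155, 0.55)
reasoning = estimate_response_seconds(1_000, 152, 7.32)
print(f"standard: {standard:.1f} s, reasoning: {reasoning:.1f} s")
```

For a 1,000-token reply this works out to about 7 seconds in standard mode and about 14 seconds in reasoning mode, so the difference comes almost entirely from the longer wait before the first output rather than from throughput.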

Source: Artificial Analysis, LiveBench, ForecastBench, Vectara HHEM

Multilingual Capabilities

MGSM 🇰🇷: 94%
MGSM 🇯🇵: 92%
KMMLU 🇰🇷: 77%
JMMLU 🇯🇵: 75%
