OpenAI

GPT-4.1

Name: OpenAI GPT-4.1
Author: OpenAI

Model ID: gpt-4.1-2025-04-14

2025-04-14

GPT-4.1 is OpenAI's flagship language model optimized for coding, instruction following, and long-context reasoning, released in April 2025. It supports a 1-million-token context window — over 8× the capacity of GPT-4o — and achieves 54.6% on SWE-bench Verified, representing a major improvement in real-world software engineering tasks. The model excels at precise code diffs, agent reliability, and high recall across large document contexts, making it well-suited for IDE tooling, automated coding agents, and enterprise knowledge retrieval.

API | Vision, Web Search, File | Proprietary Model

Knowledge Cutoff

2024-06-30

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

1.0M tokens in / 33K tokens out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
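As a rough illustration of how these limits might be applied before sending a request, the sketch below checks token counts against the figures listed on this page. The constants are the rounded values shown above (1.0M in, 33K out), not exact API limits, and `fits_limits` is a hypothetical helper, not part of any SDK. It also assumes the input and output limits are checked independently, which this page does not confirm.

```python
# Rounded limits as listed on this page; the exact API limits may differ.
MAX_INPUT_TOKENS = 1_000_000   # "1.0M" in
MAX_OUTPUT_TOKENS = 33_000     # "33K" out


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both counts are within the listed limits (hypothetical helper)."""
    return (
        0 <= input_tokens <= MAX_INPUT_TOKENS
        and 0 <= requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


# Example: an 800,000-token document with a 4,000-token reply budget.
print(fits_limits(800_000, 4_000))
```

Token counts depend on the tokenizer, so in practice the input count would come from a tokenizer library rather than a word count.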

Cost/1M Tokens

$2 in / $8 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
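To make the pricing concrete, here is a minimal sketch of the arithmetic, assuming the listed rates of $2 per 1 million input tokens and $8 per 1 million output tokens apply uniformly. `estimate_cost_usd` is a hypothetical helper name, and the sketch ignores any discounts or surcharges a provider might apply.

```python
# Prices as listed on this page, in USD per 1 million tokens.
INPUT_USD_PER_M = 2.0
OUTPUT_USD_PER_M = 8.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed rates (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Example: a 200,000-token prompt with a 5,000-token reply.
print(f"${estimate_cost_usd(200_000, 5_000):.2f}")  # prints "$0.44"
```

Output tokens cost four times as much as input tokens at these rates, so long generated replies tend to dominate the bill even when prompts are large.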

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score

1312 ±4

As of 2026-04-23

Overall Rank

No. 207

100,105 Votes

Arena by Ability

Hard Prompts: 1311 ±6 (No. 213)
Expert Knowledge: 1286 ±12 (No. 206)
Instruction Following: 1294 ±6 (No. 205)
Conversation Memory: 1298 ±8 (No. 206)
Creative: 1285 ±8 (No. 194)
Coding: 1338 ±7 (No. 214)
Math: 1303 ±8 (No. 184)

Arena by Occupation

Creative Writing: 1306 ±6 (No. 188)
Social Sciences: 1321 ±8 (No. 211)
Media: 1290 ±8 (No. 182)
Business: 1282 ±9 (No. 226)
Healthcare: 1305 ±12 (No. 212)
Legal: 1317 ±11 (No. 215)
Software: 1324 ±6 (No. 221)
Mathematics: 1308 ±8 (No. 186)

Source: Arena Intelligence

Overall

AA Intelligence Index: 26% (↓12%)
ForecastBench: 59% (↑0%)

Reasoning & Math

AA Math Index: 35% (↓39%)
GPQA Diamond: 67% (↓14%)
HLE: 4.6% (↓13%)
MMLU-Pro: 81% (↓1%)
AIME 2025: 35% (↓39%)
MATH-500: 91% (↓2%)

Coding

AA Coding Index: 22% (↓12%)
LiveCodeBench: 46% (↓20%)
TAU2: 47% (↓26%)
TerminalBench: 14% (↓17%)
SciCode: 38% (↓3%)

Language & Instructions

IFBench: 43% (↓14%)
AA-LCR: 61% (↓1%)
Hallucination (HHEM): 5.6% (↓5%)
Factual (HHEM): 94% (↑4%)

Output Speed

Standard Mode: 103 tok/s (↑21)

First Output: 0.58 s
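The two speed figures can be combined into a back-of-the-envelope estimate of how long a full response takes: time to first output plus the remaining tokens at the listed throughput. The sketch below assumes throughput stays constant at 103 tok/s for the whole response, which real traffic will not guarantee, and `estimate_response_seconds` is a hypothetical helper.

```python
# Figures as listed on this page for Standard Mode.
TOKENS_PER_SECOND = 103.0
FIRST_OUTPUT_SECONDS = 0.58


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate for a streamed response (hypothetical helper)."""
    return FIRST_OUTPUT_SECONDS + output_tokens / TOKENS_PER_SECOND


# Example: a 1,030-token reply.
print(f"{estimate_response_seconds(1030):.2f} s")  # prints "10.58 s"
```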

Source: Artificial Analysis, ForecastBench, Vectara HHEM
