Claude Opus 4 is Anthropic's breakthrough coding and agent model, released in May 2025, setting new standards for sustained performance on complex, long-running tasks. It leads on SWE-bench (72.5%) and Terminal-bench (43.2%), and can sustain agentic workflows spanning thousands of task steps, running continuously for hours without degradation. As a hybrid model, it offers both near-instant responses and extended thinking for deeper reasoning, along with parallel tool use and improved instruction memory.
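The "hybrid" behavior described above is selected per request: extended thinking is opted into with a token budget, otherwise the model answers in its fast mode. A minimal sketch of building such a request for the Anthropic Messages API follows; the model id and budget values are assumptions, so check Anthropic's documentation for the exact identifiers available to your account.

```python
# Sketch: fast mode vs. extended thinking for Claude Opus 4 via the
# Anthropic Messages API. Model id and budgets are assumed values.

def build_request(prompt: str, extended_thinking: bool = False) -> dict:
    """Build Messages API parameters for a fast or extended-thinking call."""
    params = {
        "model": "claude-opus-4-20250514",  # assumed model id
        "max_tokens": 2048,
        "messages": [{"role": "user", "content": prompt}],
    }
    if extended_thinking:
        # Thinking is enabled per request with a token budget; the budget
        # must fit inside max_tokens, so raise the cap alongside it.
        params["max_tokens"] = 16000
        params["thinking"] = {"type": "enabled", "budget_tokens": 8192}
    return params

# To actually send (requires `pip install anthropic` and ANTHROPIC_API_KEY):
#   import anthropic
#   client = anthropic.Anthropic()
#   reply = client.messages.create(**build_request("...", extended_thinking=True))
```

The same prompt can thus be routed to either mode by flipping one flag, which is how the near-instant and deeper-reasoning behaviors coexist in one model.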
API · Vision · Reasoning · Web Search · File · Proprietary Model
Knowledge Cutoff: 2025-05-01
Context Memory: 1M tokens in → 128K tokens out
AI Performance Evaluation
Arena Overall Score: 1424 ±4 (as of 2026-04-23)
Overall Rank: No. 66 (36,951 votes)
Arena by Ability
Hard Prompts: 1455 ±6 (No. 53)
Expert Knowledge: 1446 ±14 (No. 66)
Instruction Following: 1442 ±7 (No. 33)
Conversation Memory: 1437 ±8 (No. 55)
Creative: 1429 ±9 (No. 33)
Coding: 1498 ±8 (No. 36)
Math: 1419 ±12 (No. 70)
Arena by Occupation
Creative Writing: 1429 ±7 (No. 37)
Social Sciences: 1439 ±8 (No. 70)
Media: 1420 ±8 (No. 39)
Business: 1412 ±8 (No. 82)
Healthcare: 1446 ±13 (No. 69)
Legal: 1436 ±12 (No. 65)
Software: 1466 ±6 (No. 53)
Mathematics: 1424 ±13 (No. 68)
Source: Arena Intelligence
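Arena scores are Elo-style ratings, so the gap between two scores maps to an expected head-to-head win rate. Assuming the standard Elo scale (base 10, divisor 400; the exact scale used by the arena is an assumption), the conversion is a one-liner:

```python
# Expected head-to-head win rate from two Elo-style arena scores,
# assuming the conventional logistic scale (base 10, divisor 400).
def win_probability(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B in a pairwise vote."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# Opus 4's overall score (1424) vs. a hypothetical 1400-rated model:
print(round(win_probability(1424, 1400), 2))  # → 0.53
```

This is why a 20-30 point gap on the leaderboard corresponds to only a slight (53-54%) edge in individual matchups.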
Overall
AA Intelligence Index: 39% (↑1%)
ForecastBench: 61% (↑2%)

Reasoning & Math
AA Math Index: 73% (↑0%)
GPQA Diamond: 80% (↓1%)
HLE: 12% (↓5%)
MMLU-Pro: 87% (↑5%)
AIME 2025: 73% (↑0%)
MATH-500: 98% (↑5%)
Coding
AA Coding Index: 34% (↑0%)
LiveCodeBench: 64% (↓2%)
TAU2: 73% (↑0%)
TerminalBench: 31% (↑0%)
SciCode: 40% (↓1%)

Language & Instructions
IFBench: 54% (↓3%)
AA-LCR: 34% (↓28%)
Hallucination (HHEM): 12% (↑2%)
Factual (HHEM): 88% (↓2%)
Output Speed
Standard Mode: 34 tok/s (↓48), first output in 1.33 s
Reasoning Mode: 48 tok/s (↓40), first output in 7.45 s