AI Model Comparison

Claude Opus 4 is Anthropic's coding and agent model, released in May 2025 and built for sustained performance on complex, long-running tasks. By Anthropic's reported figures, it leads on SWE-bench (72.5%) and Terminal-bench (43.2%), and it can run agentic workflows spanning thousands of steps over several hours without degradation. As a hybrid model, it offers both near-instant responses and extended thinking for deeper reasoning, along with parallel tool use and improved instruction following and memory.

Author: Anthropic
Release Date: 2025-05-22
Knowledge Cutoff: 2025-05-01
License: Proprietary
I/O Format:
Context Length: 1M / 128K
API Price, Input / Output (USD per 1M tokens): $15 / $75
How to Use: API Access
Output Speed: 34 tok/s

Arena Overall: 1424
Intelligence Index: 39.0
Coding Index: 34.0
Math Index: 73.3
LiveBench: n/a
ForecastBench: 60.6

GPQA Diamond: 79.6%
HLE: 11.7%
MMLU-Pro: 87.3%
AIME 2025: 73.3%
MATH-500: 98.2%
LiveBench Reasoning: n/a
LiveBench Math: n/a
LiveBench Data Analysis: n/a
LiveCodeBench: 63.6%
LiveBench Coding: n/a
LiveBench Agentic: n/a
TAU2: 73.4%
TerminalBench: 31.1%
SciCode: 39.8%
IFBench: 53.7%
AA-LCR: 0.3
Hallucination (HHEM): 12.0%
Factual Consistency (HHEM): 88.0%
LiveBench Language: n/a
LiveBench Instruction Following: n/a
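
The listed API prices ($15 / $75 per 1M input / output tokens) and output speed (34 tok/s) can be turned into rough per-request estimates. The Python sketch below is illustrative only: the function and constant names are hypothetical, the rates are copied from this card and may be out of date, and the arithmetic ignores anything the card does not list, such as caching or batch discounts, latency to first token, and rate limits.

```python
# Rates copied from the card above (assumed current; verify before relying on them).
INPUT_USD_PER_M = 15.0    # USD per 1M input tokens
OUTPUT_USD_PER_M = 75.0   # USD per 1M output tokens
OUTPUT_TOK_PER_S = 34.0   # listed output speed, tokens per second


def estimate_cost_usd(input_tokens, output_tokens,
                      input_rate=INPUT_USD_PER_M,
                      output_rate=OUTPUT_USD_PER_M):
    """Return the estimated request cost in USD at per-1M-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def estimate_generation_seconds(output_tokens, tok_per_s=OUTPUT_TOK_PER_S):
    """Return the estimated time to stream the output at a steady token rate."""
    return output_tokens / tok_per_s


if __name__ == "__main__":
    # 200K input tokens and 10K output tokens: $3.00 + $0.75
    print(f"${estimate_cost_usd(200_000, 10_000):.2f}")    # $3.75
    # 3,400 output tokens at 34 tok/s
    print(f"{estimate_generation_seconds(3_400):.0f} s")   # 100 s
```

Because output tokens cost five times as much as input tokens at these rates, long generations dominate the bill even when the prompt is much larger than the response.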
