MiniMax M2.5 is a frontier language model trained with reinforcement learning across hundreds of thousands of complex real-world environments, achieving state-of-the-art scores of 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp. Building on the coding expertise of M2.1, it extends into general office productivity — generating and operating Word, Excel, and PowerPoint files, context-switching between diverse software environments, and collaborating across agent and human teams. It completes evaluations 37% faster than M2.1 while being cost-efficient enough to run continuously for $1 per hour.
API | Reasoning | Open Model (Modified MIT license)

Knowledge Cutoff: Unknown
Input → Output Format: not specified
Context Window: 197K input / 66K output
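The 197K-input / 66K-output limits above can be enforced client-side before a request is sent. The sketch below is a hypothetical helper, not an official client: the `plan_request` function, the `MiniMax-M2.5` model identifier string, and the ~4-characters-per-token estimate are all assumptions for illustration; a real integration would use the provider's tokenizer and SDK.

```python
# Hypothetical request-budgeting helper for the limits on this card:
# 197K-token input window, 66K-token output cap. Token counts are rough
# estimates (~4 characters per token), not real tokenizer output.

INPUT_LIMIT = 197_000
OUTPUT_LIMIT = 66_000

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def plan_request(prompt: str, max_output_tokens: int) -> dict:
    """Validate a request against the advertised limits, return a payload."""
    used = estimate_tokens(prompt)
    if used > INPUT_LIMIT:
        raise ValueError(f"prompt ~{used} tokens exceeds {INPUT_LIMIT}-token input limit")
    if max_output_tokens > OUTPUT_LIMIT:
        raise ValueError(f"max_output_tokens exceeds {OUTPUT_LIMIT}-token output cap")
    # Model name here is illustrative, not a confirmed API identifier.
    return {"model": "MiniMax-M2.5", "prompt": prompt, "max_tokens": max_output_tokens}

payload = plan_request("Summarize this report.", max_output_tokens=4_096)
```

The check is deliberately conservative: because the character-based estimate can undercount, a production client should re-validate with the exact tokenizer the serving API uses.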
AI Performance Evaluation

Arena Overall Score: 1400 ±5 (as of 2026-04-23)
Overall Rank: No. 100 (21,236 votes)
Arena by Ability
Hard Prompts: 1425 ±6 (No. 91)
Expert Knowledge: 1440 ±15 (No. 71)
Instruction Following: 1396 ±8 (No. 92)
Conversation Memory: 1408 ±10 (No. 92)
Creative: 1376 ±10 (No. 94)
Coding: 1456 ±9 (No. 87)
Math: 1411 ±15 (No. 81)
Arena by Occupation
Creative Writing: 1384 ±9 (No. 93)
Social Sciences: 1408 ±11 (No. 107)
Media: 1382 ±10 (No. 84)
Business: 1412 ±10 (No. 83)
Healthcare: 1405 ±16 (No. 116)
Legal: 1411 ±16 (No. 94)
Software: 1442 ±7 (No. 90)
Mathematics: 1416 ±18 (No. 83)

Source: Arena Intelligence
Overall
AA Intelligence Index: 42% (↑4%)
LiveBench: 60% (↑0%)

Reasoning & Math
GPQA Diamond: 85% (↑4%)
HLE: 19% (↑2%)
LB Reasoning: 59% (↑0%)
LB Math: 77% (↑4%)
LB Data: 50% (↑0%)
Coding
AA Coding Index: 37% (↑3%)
LB Coding: 71% (↓3%)
LB Agentic: 52% (↑8%)
TAU2: 95% (↑22%)
TerminalBench: 35% (↑4%)
SciCode: 43% (↑2%)
Language & Instructions
IFBench: 72% (↑15%)
AA-LCR: 66% (↑4%)
LB Language: 55% (↓17%)
LB IF: 57% (↑11%)
Output Speed
Standard Mode: 104 tok/s (↑22)
First Output: 20.51 s
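Combining the 104 tok/s sustained output speed with the "$1 per hour" continuous-run figure from the introduction gives a back-of-envelope per-token cost. This assumes, purely for illustration, that both numbers hold simultaneously under steady full-throughput load:

```python
# Back-of-envelope cost from the figures on this card: 104 tok/s output
# speed and $1/hour continuous operation (assumed to hold concurrently).
speed_tok_s = 104
cost_per_hour = 1.00

tokens_per_hour = speed_tok_s * 3600                          # 374,400 tokens/hour
cost_per_million = cost_per_hour / tokens_per_hour * 1_000_000  # ~$2.67 per 1M output tokens
print(f"{tokens_per_hour} tokens/hour, ~${cost_per_million:.2f} per 1M output tokens")
```

Real workloads rarely saturate throughput continuously, so effective per-token cost would be higher in practice.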