Nemotron 3 Super is NVIDIA's open hybrid Mamba-Transformer MoE model with 120 billion total parameters, of which only 12 billion are active per token for compute efficiency. Its hybrid architecture pairs Mamba layers for efficient sequence handling with Transformer layers for precise reasoning, delivering more than 5× the throughput of its predecessor. With a native 1M-token context window and NVFP4 precision optimized for Blackwell GPUs, it scores 85.6% on PinchBench (the best among open models), making it well suited for complex multi-agent applications, software development, and agentic reasoning.
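The "120B total, 12B active" figure comes from sparse mixture-of-experts routing: a small router picks a few experts per token, so most parameters sit idle on any given forward pass. A minimal toy sketch of top-k routing (expert count and k here are illustrative, not Nemotron's published configuration):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of router logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(router_logits, k=2):
    """Pick the top-k experts for one token and renormalize their weights.

    Returns a list of (expert_index, weight) pairs; only these experts'
    parameters are actually exercised for this token.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    z = sum(probs[i] for i in top)
    return [(i, probs[i] / z) for i in top]

# One token's router scores over 10 hypothetical experts:
logits = [0.1, 2.3, -0.5, 1.7, 0.0, -1.2, 0.4, 0.9, -0.3, 1.1]
active = route(logits, k=2)
# Only 2 of 10 experts run for this token; the renormalized weights
# of the selected experts sum to 1.
```

In a real MoE layer this selection happens per token per layer, which is how the active-parameter count stays at a small fraction of the total.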
Reasoning | Open Model
Knowledge Cutoff
2026-02-01
Input → Output Format
Context Memory
262K IN / 1M OUT
Source: Official Docs
AI Performance Evaluation
Arena Overall Score
1361
±7 · As of 2026-04-23
Overall Rank
No.142
7,408 Votes
Arena by Ability
Hard Prompts
1381 ±9 · No.140
Expert Knowledge
1398 ±24 · No.118
Instruction Following
1347 ±13 · No.145
Conversation Memory
1349 ±17 · No.147
Creative
1301 ±19 · No.174
Coding
1408 ±14 · No.140
Math
1378 ±25 · No.129
Arena by Occupation
Creative Writing
1324 ±15 · No.159
Social Sciences
1366 ±17 · No.154
Media
1317 ±17 · No.151
Business
1349 ±16 · No.156
Healthcare
1351 ±26 · No.167
Legal
1368 ±26 · No.150
Software
1404 ±11 · No.137
Mathematics
1398 ±28 · No.109
Source: Arena Intelligence
Overall
AA Intelligence Index
36% (↓2%)
LiveBench
32% (↓28%)
Reasoning & Math
GPQA Diamond
80% (↓1%)
HLE
19% (↑2%)
LB Reasoning
34% (↓25%)
LB Math
36% (↓37%)
LB Data
21% (↓28%)
Coding
AA Coding Index
31% (↓3%)
LB Coding
54% (↓20%)
LB Agentic
23% (↓20%)
TAU2
68% (↓5%)
TerminalBench
29% (↓2%)
SciCode
36% (↓5%)
Language & Instructions
IFBench
72% (↑15%)
AA-LCR
60% (↓2%)
LB Language
30% (↓42%)
LB IF
28% (↓18%)
Output Speed
Standard Mode
80 tok/s (↓2)
First Output: 1.88 s
Reasoning Mode
158 tok/s (↑70)
First Output: 13.70 s
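The throughput and first-output figures above combine into a rough end-to-end latency estimate: time to first token plus decode time. A minimal sketch using the card's numbers (ignoring network overhead, batching, and any throughput variation during decode):

```python
def response_time(n_tokens, ttft_s, toks_per_s):
    """Estimate end-to-end latency: time-to-first-token plus decode time."""
    return ttft_s + n_tokens / toks_per_s

# Figures from the card above, for a hypothetical 1000-token response:
standard = response_time(1000, ttft_s=1.88, toks_per_s=80)    # ≈ 14.38 s
reasoning = response_time(1000, ttft_s=13.70, toks_per_s=158)  # ≈ 20.03 s
```

Note the trade-off this makes visible: reasoning mode decodes nearly twice as fast, but its much longer time-to-first-token dominates for short responses, so it only pulls ahead on long generations.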