
Nemotron 3 Super is NVIDIA's open hybrid Mamba-Transformer MoE model with 120 billion total parameters, of which only 12 billion are active per token, for maximum compute efficiency. Its hybrid architecture combines Mamba layers for sequence efficiency with Transformer layers for precision reasoning, delivering over 5× the throughput of its predecessor. With a native 1M-token context window and NVFP4 precision optimized for Blackwell GPUs, it scores 85.6% on PinchBench (the best among open models), making it well suited for complex multi-agent applications, software development, and agentic reasoning.
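The "120B total, 12B active" distinction comes from mixture-of-experts routing: a gate scores all experts per token, but only the top-k actually run. The sketch below is a toy illustration of that mechanism, not NVIDIA's implementation; the expert count, gate scores, and k=2 are all made up for the example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of gate scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token, experts, gate_scores, k=2):
    """Route a token to the top-k experts and mix their outputs.

    Only k of len(experts) experts execute, which is why an MoE model
    with many total parameters activates only a fraction per token
    (here, 12B of 120B, i.e. 10%).
    """
    probs = softmax(gate_scores)
    topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in topk)  # renormalize over selected experts
    return sum(probs[i] / norm * experts[i](token) for i in topk)

# Toy experts: scalar functions standing in for feed-forward blocks.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
out = moe_forward(10.0, experts, gate_scores=[0.1, 0.3, 2.0, 0.2], k=2)
```

With these scores, experts 2 and 1 are selected and their outputs blended by renormalized gate weight; the other two experts never run, costing no compute.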

Author: NVIDIA
Release Date: 2026-03-11
Knowledge Cutoff: 2026-02-01
License: Open Model
I/O Format: –
Context Length: 262K / 1M
API I/O (per 1M tokens): $0.09 / $0.45
Output Speed: 80 tok/s
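Assuming the listed API prices are per 1M input and output tokens respectively (the usual convention for "API I/O" pricing, though the page does not say so explicitly), a per-request cost estimate is simple arithmetic:

```python
def request_cost_usd(input_tokens, output_tokens,
                     in_price=0.09, out_price=0.45):
    """Estimate per-request API cost from per-1M-token prices.

    Defaults mirror the listed $0.09 input / $0.45 output pricing;
    both are assumptions about how the page's numbers are denominated.
    """
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# e.g. a 200K-token context with a 2K-token response:
cost = request_cost_usd(200_000, 2_000)  # 0.018 + 0.0009 = 0.0189
```

At the listed 80 tok/s output speed, that 2K-token response would also take roughly 25 seconds to stream.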
Arena Overall: 1361
Intelligence Index: 36.0
Coding Index: 31.2
Math Index: –
LiveBench: 32.0
ForecastBench: –
GPQA Diamond: 80.0%
HLE: 19.2%
MMLU-Pro: –
AIME 2025: –
MATH-500: –
LB Reasoning: 34.4
LB Math: 36.4
LB Data Analysis: 21.2
LiveCodeBench: –
LB Coding: 54.1
LB Agentic: 23.0
TAU2: 67.8%
TerminalBench: 28.8%
SciCode: 36.0%
IFBench: 71.5%
AA-LCR: 0.6
Hallucination (HHEM): –
Factual Consistency (HHEM): –
LB Language: 30.0
LB Instruction Following: 28.4