Google

Gemma 4 31B

2026-04-02

Gemma 4 31B is Google DeepMind's most capable open-weight model, a 30.7-billion-parameter dense multimodal model released under the Apache 2.0 license. It accepts text and image inputs with a 256K-token context window and supports configurable thinking (reasoning) modes, native function calling, structured JSON output, and more than 140 languages. It ranks among the top three open models globally on the Arena AI leaderboard, matches or exceeds much larger models such as Llama 4 and Qwen 3.5 on math, coding, and agentic tool use, and can run quantized on consumer GPUs with 24 GB of VRAM.

API | Vision | Reasoning | Open Model | Apache 2.0
Knowledge Cutoff
2025-01-01
Input → Output Format
Context Memory: 262K in, 131K out
Cost/1M Words: $0.13 in, $0.38 out
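As a rough illustration of how the listed prices translate into a per-request cost, the sketch below multiplies word counts by the per-million-word rates. The function name `estimate_cost` and the sample request sizes are hypothetical; only the two prices come from the listing, and actual billing by a given provider may differ.

```python
# Prices in USD per 1M words, copied from the listing.
PRICE_IN_PER_M_WORDS = 0.13
PRICE_OUT_PER_M_WORDS = 0.38


def estimate_cost(words_in: int, words_out: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = words_in * PRICE_IN_PER_M_WORDS + words_out * PRICE_OUT_PER_M_WORDS
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 10,000 words in, 2,000 words out.
    print(f"${estimate_cost(10_000, 2_000):.5f}")
```

Because output words are priced at roughly three times input words, long generations weigh on the total more than long prompts do.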

AI Performance Evaluation

Arena Overall Score: 1451 ±8 (as of 2026-04-23)
Overall Rank: No. 32
Votes: 5,818
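To give the score some intuition, the sketch below converts a rating gap into an expected head-to-head preference rate. It assumes the Arena score behaves like an Elo-style rating on a 400-point base-10 logistic scale, which the listing itself does not state. The rival rating of 1500 is made up for illustration.

```python
def expected_win_probability(rating_a: float, rating_b: float) -> float:
    """Elo-style probability that model A is preferred over model B.

    Assumes a base-10 logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    gemma_overall = 1451.0       # Arena Overall Score from the listing
    hypothetical_rival = 1500.0  # made-up rating, illustration only
    p = expected_win_probability(gemma_overall, hypothetical_rival)
    print(f"Expected preference rate vs. rival: {p:.3f}")
```

Under this assumption, a gap of a few dozen points corresponds to a modest shift away from an even split, so nearby ranks should not be read as large quality differences.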
Arena by Ability
Hard Prompts: 1474 ±10, No. 36
Expert Knowledge: 1482 ±27, No. 30
Instruction Following: 1452 ±14, No. 25
Conversation Memory: 1461 ±18, No. 36
Creative: 1422 ±20, No. 37
Coding: 1498 ±16, No. 35
Math: 1468 ±28, No. 18
Arena by Occupation
Creative Writing: 1432 ±16, No. 34
Social Sciences: 1464 ±20, No. 39
Media: 1415 ±18, No. 42
Business: 1443 ±17, No. 43
Healthcare: 1464 ±29, No. 49
Legal: 1467 ±27, No. 28
Software: 1490 ±12, No. 31
Mathematics: 1471 ±31, No. 18
Overall
AA Intelligence Index: 39% (↑1%)
LiveBench: 62% (↑2%)
Reasoning & Math
GPQA Diamond: 86% (↑5%)
HLE: 23% (↑6%)
LB Reasoning: 59% (↑0%)
LB Math: 74% (↑0%)
LB Data: 59% (↑9%)
Coding
AA Coding Index: 39% (↑5%)
LB Coding: 60% (↓13%)
LB Agentic: 40% (↓3%)
TAU2: 60% (↓13%)
TerminalBench: 36% (↑5%)
SciCode: 43% (↑3%)
Language & Instructions
IFBench: 76% (↑19%)
AA-LCR: 62% (↑0%)
Hallucination (HHEM): 7.4% (↓3%)
Factual (HHEM): 93% (↑3%)
LB Language: 71% (↑0%)
LB IF: 68% (↑21%)
Output Speed
Standard Mode: 14 tok/s (↓68), first output 1.21 s
Reasoning Mode: 35 tok/s (↓53), first output 58.31 s
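The speed figures can be combined into a back-of-the-envelope estimate of how long a response takes. The sketch below assumes that "first output" is the delay before the first token and that generation then proceeds at the listed steady rate. The listing states neither, and the helper name and the 500-token response size are illustrative.

```python
def estimate_response_seconds(
    output_tokens: int, first_output_s: float, tokens_per_s: float
) -> float:
    """Estimate wall-clock time: initial delay plus steady-rate generation."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return first_output_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    tokens = 500  # illustrative response length
    # Figures from the listing: (first output in seconds, tokens per second).
    modes = {"Standard": (1.21, 14.0), "Reasoning": (58.31, 35.0)}
    for name, (first_s, rate) in modes.items():
        seconds = estimate_response_seconds(tokens, first_s, rate)
        print(f"{name} mode: about {seconds:.1f} s for {tokens} tokens")
```

Under these assumptions the long first-output delay dominates reasoning-mode latency for short answers, while the lower throughput dominates standard-mode latency for long ones.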