Google

Gemma 4 31B

Name: Google Gemma 4 31B
Author: Google

Try It Compare

Model ID:google/gemma-4-31b-it

2026-04-02

Try It Compare

Gemma 4 31B is Google DeepMind's most capable open-weight model, a 30.7-billion-parameter dense multimodal model released under the Apache 2.0 license. It processes text and image inputs with a 256K-token context window, supports configurable thinking/reasoning modes, native function calling, structured JSON output, and over 140 languages. Ranking among the top three open models globally on the Arena AI leaderboard, it matches or exceeds much larger models like Llama 4 and Qwen 3.5 on math, coding, and agent tool use, and can run quantized on consumer GPUs with 24GB of VRAM.

API|VisionReasoning|Open ModelApache 2.0

Knowledge Cutoff

2025-01-01

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

262KIN131KOUT

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.

Cost/1M Words

$0.13IN$0.38OUT

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).

Calculate Cost

Source:Official Docs OpenRouter

AI Performance Evaluation

Arena Overall Score

1451

±8

As of 2026-04-23

Overall Rank

No.32

5,818 Votes

Arena by Ability

Hard Prompts

1474±10No.36

Expert Knowledge

1482±27No.30

Instruction Following

1452±14No.25

Conversation Memory

1461±18No.36

Creative

1422±20No.37

Coding

1498±16No.35

Math

1468±28No.18

Arena by Occupation

Creative Writing

1432±16No.34

Social Sciences

1464±20No.39

Media

1415±18No.42

Business

1443±17No.43

Healthcare

1464±29No.49

Legal

1467±27No.28

Software

1490±12No.31

Mathematics

1471±31No.18

Source:Arena Intelligence

Overall

AA Intelligence Index

39%↑1%

LiveBench

62%↑2%

Reasoning & Math

GPQA Diamond

86%↑5%

HLE

23%↑6%

LB Reasoning

59%↑0%

LB Math

74%↑0%

LB Data

59%↑9%

Coding

AA Coding Index

39%↑5%

LB Coding

60%↓13%

LB Agentic

40%↓3%

TAU2

60%↓13%

TerminalBench

36%↑5%

SciCode

43%↑3%

Language & Instructions

IFBench

76%↑19%

AA-LCR

62%↑0%

Hallucination (HHEM)

7.4%↓3%

Factual (HHEM)

93%↑3%

LB Language

71%↑0%

LB IF

68%↑21%

Output Speed

Standard Mode

14tok/s↓68

First Output 1.21s

Reasoning Mode

35tok/s↓53

First Output 58.31s

Source:Artificial Analysis LiveBench Vectara HHEM

Google