Moonshot AI

Kimi K2.5

Name: Moonshot AI Kimi K2.5
Author: Moonshot AI

Model ID: moonshotai/kimi-k2.5

Released: 2026-01-27

Kimi K2.5 is Moonshot AI's native multimodal model released in January 2026, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15 trillion mixed visual and text tokens, it generates code from visual specifications — turning UI designs and video workflows into working implementations. Its agent swarm technology can self-direct up to 100 parallel sub-agents, each independently using tools to search, generate, analyze, and organize information, reducing execution time by up to 4.5× for complex research and writing tasks.

API | Vision | Reasoning | Open Model | Modified MIT

Knowledge Cutoff

2026-02-02

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

262K in / 66K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
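As a rough illustration of how these limits apply, the sketch below checks whether a request stays within them. It assumes "262K" and "66K" mean 262,000 input tokens and 66,000 output tokens and that the two limits apply separately; the page gives neither the exact values nor that detail, and `fits_limits` is a hypothetical helper, not part of any Moonshot AI or OpenRouter SDK.

```python
# Assumed limits read off the page; the exact token counts are not stated there.
MAX_INPUT_TOKENS = 262_000
MAX_OUTPUT_TOKENS = 66_000


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both counts are within the assumed limits."""
    return (
        0 <= input_tokens <= MAX_INPUT_TOKENS
        and 0 <= requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


print(fits_limits(200_000, 8_000))  # True
print(fits_limits(300_000, 8_000))  # False: input exceeds the assumed limit
```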

Cost/1M Tokens

$0.44 in / $2 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
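The listed prices can be turned into a per-request estimate with simple arithmetic. The sketch below assumes the prices are per 1 million tokens, as the description above states, and that billing is linear in token count with no caching discounts or minimum charges; `estimate_cost_usd` is a hypothetical helper name.

```python
# Prices taken from the listing, in USD per 1 million tokens.
INPUT_USD_PER_MILLION = 0.44
OUTPUT_USD_PER_MILLION = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, assuming simple linear per-token billing."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Example: 100,000 input tokens and 10,000 output tokens.
cost = estimate_cost_usd(100_000, 10_000)
print(f"${cost:.4f}")  # $0.0640
```

At these rates, output tokens cost roughly 4.5 times as much as input tokens, so long generations dominate the bill more quickly than long prompts.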

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score: 1449 ±5 (as of 2026-04-23)
Overall Rank: No. 36 (24,213 votes)

Arena by Ability

Hard Prompts: 1471 ±6, No. 37
Expert Knowledge: 1487 ±14, No. 23
Instruction Following: 1439 ±7, No. 35
Conversation Memory: 1451 ±9, No. 40
Creative: 1416 ±10, No. 42
Coding: 1508 ±8, No. 24
Math: 1477 ±14, No. 10

Arena by Occupation

Creative Writing: 1425 ±8, No. 40
Social Sciences: 1471 ±10, No. 30
Media: 1421 ±9, No. 36
Business: 1435 ±9, No. 52
Healthcare: 1467 ±15, No. 44
Legal: 1444 ±14, No. 57
Software: 1493 ±7, No. 27
Mathematics: 1483 ±17, No. 9

Source: Arena Intelligence

Overall

AA Intelligence Index: 47% ↑8%
LiveBench: 69% ↑9%

Reasoning & Math

GPQA Diamond: 88% ↑7%
HLE: 29% ↑12%
LB Reasoning: 76% ↑16%
LB Math: 85% ↑11%
LB Data: 61% ↑12%

Coding

AA Coding Index: 40% ↑5%
LB Coding: 78% ↑4%
LB Agentic: 48% ↑5%
TAU2: 96% ↑23%
TerminalBench: 35% ↑4%
SciCode: 49% ↑8%

Language & Instructions

IFBench: 70% ↑13%
AA-LCR: 65% ↑4%
Hallucination (HHEM): 14% ↑4%
Factual (HHEM): 86% ↓4%
LB Language: 78% ↑6%
LB IF: 57% ↑11%

Output Speed

Standard Mode: 33 tok/s ↓49, first output 1.42 s
Reasoning Mode: 32 tok/s ↓56, first output 93.86 s
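To give a feel for what these figures mean in practice, the sketch below estimates end-to-end response time as first-output latency plus generation time. It assumes "First Output" is the delay before the first visible token and that throughput stays constant afterwards; real latency varies with load, prompt length, and provider. `SpeedProfile` and `estimate_seconds` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedProfile:
    first_output_s: float  # delay before the first output token, in seconds
    tokens_per_s: float    # sustained output throughput


# Figures taken from the listing above.
STANDARD = SpeedProfile(first_output_s=1.42, tokens_per_s=33.0)
REASONING = SpeedProfile(first_output_s=93.86, tokens_per_s=32.0)


def estimate_seconds(profile: SpeedProfile, output_tokens: int) -> float:
    """Estimate total response time, assuming constant throughput."""
    return profile.first_output_s + output_tokens / profile.tokens_per_s


for name, profile in (("standard", STANDARD), ("reasoning", REASONING)):
    print(f"{name}: {estimate_seconds(profile, 1000):.1f} s")
# standard: 31.7 s
# reasoning: 125.1 s
```

Under these assumptions, the two modes generate at nearly the same rate, so the gap between them comes almost entirely from the wait before the first output.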

Source: Artificial Analysis, LiveBench, Vectara HHEM
