LiveBench reasoning category score (0-100). Evaluates logic puzzles, spatial reasoning, causal analysis, and similar tasks.
Anthropic
Claude Opus 4.6
OpenAI
GPT-5.4
GPT-5.5
Google
Gemini 3.1 Pro
GPT-5
Grok
Grok 4.1 Fast (Reasoning)
Moonshot AI
Kimi K2.6
Claude Sonnet 4.6
Kimi K2.5
Alibaba
Qwen3.6 Plus
Grok 4.20 (Reasoning)
Claude Opus 4.7
MiniMax
MiniMax M2.7
Z.ai
GLM-5.1
Claude Opus 4.1
Gemini 2.5 Pro
Xiaomi
MiMo-V2-Pro
GLM-5
Claude Sonnet 4
Gemini 3.1 Flash Lite
Gemma 4 31B
MiniMax M2.5
GPT-5 Mini
Gemini 3 Flash
Claude Opus 4.5
Gemini 2.5 Flash
DeepSeek
DeepSeek V3.2
Gemini 2.5 Flash Lite
Claude Sonnet 4.5
GPT OSS 120B
GPT-5 Nano
NVIDIA
Nemotron 3 Super
Claude Haiku 4.5
Grok 4.20
Grok 4.1 Fast
GPT-5.4 Mini
Arcee AI
Trinity Large Thinking
GPT-5.4 Nano