Arena Elo scores based on user preference in multi-turn prompts, where the conversation continues over several exchanges.
Anthropic
Claude Opus 4.7
Claude Opus 4.6
Meta
Muse Spark
Google
Gemini 3.1 Pro
OpenAI
GPT-5.4
GPT-5.4 Pro
Grok
Grok 4.20
Grok 4.20 (Reasoning)
Claude Opus 4.5
Gemini 3 Flash
Claude Sonnet 4.6
DeepSeek
DeepSeek V4 Pro
Z.ai
GLM-5.1
GPT-5.4 Mini
Claude Opus 4.1
Claude Sonnet 4.5
GLM-5
Xiaomi
MiMo-V2-Pro
Gemma 4 31B
Moonshot AI
Kimi K2.6
Kimi K2.5
Gemini 2.5 Pro
Alibaba
Qwen3.5 397B A17B
Gemini 3.1 Flash Lite
Baidu
ERNIE 5.0 Thinking
DeepSeek V4 Flash
Claude Opus 4
Qwen3.6 Plus
DeepSeek V3.2
Claude Haiku 4.5
GPT-5
Claude Sonnet 4
Grok 4.1 Fast
Grok 4.1 Fast (Reasoning)
Meituan
Longcat Flash Chat
GPT-5.4 Nano
MiniMax
MiniMax M2.5
MiniMax M2.7
Gemini 2.5 Flash
Gemini 2.5 Flash Lite
Arcee AI
Trinity Large Thinking
GPT-5 Mini
NVIDIA
Nemotron 3 Super
GPT OSS 120B
Llama 4 Maverick
Amazon
Nova 2 Lite
GPT-5 Nano
Llama 4 Scout
GPT-4.1