Voxtral Mini TTS is Mistral's first text-to-speech model, released in March 2026 as the generative counterpart to the Voxtral speech-recognition family. It is a ~4B-parameter model designed for low-latency voice agents and streaming applications, with a 4,096-token context window and raw-audio output. Voxtral Mini TTS supports nine languages: English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic. It performs zero-shot voice cloning from as little as three seconds of reference audio, preserving intonation, rhythm, and emotional delivery without explicit prosody tags. In head-to-head testing, it won 68.4% of preference votes against ElevenLabs Flash v2.5.
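The two constraints stated above (nine supported languages, and a reference clip of at least three seconds for voice cloning) can be checked before any request is sent. The following is a minimal sketch using only Python's standard library; it does not call any Mistral API. The function names, the ISO 639-1 language codes, and the assumption that the reference clip is a WAV file are all illustrative choices of this sketch, not part of the model's documented interface.

```python
import io
import wave

# Languages listed in the description; the ISO 639-1 keys are this sketch's own labels.
SUPPORTED_LANGUAGES = {
    "en": "English", "fr": "French", "de": "German", "es": "Spanish",
    "nl": "Dutch", "pt": "Portuguese", "it": "Italian", "hi": "Hindi", "ar": "Arabic",
}
# "As little as three seconds" of reference audio, per the description.
MIN_REFERENCE_SECONDS = 3.0


def wav_duration_seconds(wav_file) -> float:
    """Return the duration of a WAV file, given a path or a binary file object."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_cloning_request(language: str, reference_wav) -> list[str]:
    """Hypothetical pre-flight check: return a list of problems (empty if none)."""
    problems = []
    if language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {language}")
    duration = wav_duration_seconds(reference_wav)
    if duration < MIN_REFERENCE_SECONDS:
        problems.append(
            f"reference clip is {duration:.1f}s; "
            f"need at least {MIN_REFERENCE_SECONDS:.0f}s"
        )
    return problems


def make_silent_wav(seconds: float, rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer
```

For example, `check_cloning_request("en", make_silent_wav(4.0))` returns an empty list, while a one-second clip in an unlisted language yields two problems. Three seconds is the minimum the description mentions; whether longer clips improve cloning quality is not stated there.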

Author
Mistral AI
Release Date
March 2026
License
Proprietary
Context Length
4,096 tokens