OpenAI

GPT-4o Mini TTS

2025-12-15

GPT-4o Mini TTS is OpenAI's cost-efficient text-to-speech model, built on the GPT-4o Mini architecture. It converts written text into natural-sounding, expressive spoken audio and is highly steerable: developers can control speech characteristics such as tone, pacing, and emphasis through natural-language instructions in the prompt. Compared with previous OpenAI TTS models, it delivers significantly lower word error rates and more natural prosody, which makes it well suited to voice agents, accessibility features, and audio content production at scale.
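The steering described above can be sketched as a request against OpenAI's speech endpoint, where the text to speak and the style guidance travel as separate fields. This is a minimal illustration rather than a reference client: the helper names `build_speech_request` and `synthesize`, the sample sentences, the `alloy` voice choice, and the output file name are all assumptions made for the example, and the network call is defined but deliberately not executed.

```python
import json
import os
import urllib.request

# OpenAI's text-to-speech endpoint.
API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, instructions, voice="alloy"):
    """Build the JSON body for one speech request (hypothetical helper).

    `text` is what gets spoken; `instructions` is the natural-language
    steering for tone, pacing, and emphasis.
    """
    return {
        "model": "gpt-4o-mini-tts",
        "voice": voice,
        "input": text,
        "instructions": instructions,
    }


def synthesize(payload, out_path="speech.mp3"):
    """Send the request and write the returned audio bytes to disk.

    Assumes OPENAI_API_KEY is set in the environment and that the
    response body is the audio itself (MP3 unless another format is asked for).
    """
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as audio:
        audio.write(response.read())
    return out_path


# Example payload: same sentence, steered toward a calm delivery.
payload = build_speech_request(
    "Your order has shipped and should arrive on Thursday.",
    "Speak in a warm, calm tone at a relaxed pace; stress the delivery day.",
)
```

Calling `synthesize(payload)` with a valid key would write the audio to `speech.mp3`. Changing only the `instructions` string while keeping `input` fixed is a simple way to compare how the model responds to different style guidance.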

API | Proprietary Model
Knowledge Cutoff
Unknown
Input → Output Format
Context Memory
2K
Cost/1M Words
$0.6 IN / $12 OUT
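As a rough worked example of the listed rates, the arithmetic can be sketched as below. It assumes the $0.6 and $12 figures scale linearly per word and that input and output are both counted in words as the listing states; the function name `estimate_cost_usd` and the word counts are hypothetical, and actual billing may be metered differently.

```python
# Listed rates in USD per 1M words, taken from the card above.
INPUT_RATE_PER_M_WORDS = 0.6
OUTPUT_RATE_PER_M_WORDS = 12.0


def estimate_cost_usd(input_words, output_words):
    """Estimate cost assuming simple linear per-word pricing (an assumption)."""
    total = (
        input_words * INPUT_RATE_PER_M_WORDS
        + output_words * OUTPUT_RATE_PER_M_WORDS
    )
    return total / 1_000_000


# Hypothetical workload: 50,000 words in, 50,000 words spoken out.
estimate = estimate_cost_usd(50_000, 50_000)
print(f"Estimated cost: ${estimate:.2f}")
```

At these rates the output side dominates: it is 20 times the input rate, so the amount of audio produced drives nearly all of the estimate.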

AI Performance Evaluation

Language & Instructions
Hallucination (HHEM)
9.6% (↓1%)
Factual (HHEM)
90% (↑1%)