1 / 3
Swipe to compare

GPT Audio is OpenAI's multimodal audio model designed for native speech-to-speech interaction via the Chat Completions API. Unlike traditional voice pipelines that chain separate speech-to-text and text-to-speech models, GPT Audio processes and generates audio directly through a single model, resulting in lower latency, more natural-sounding voices, and better preservation of speech nuances such as tone and emotion.

Author
OpenAIOpenAI
Release Date
2025-08-28
Knowledge Cutoff
2023-10-01
License
Proprietary
I/O Format
Context Length
128K / 16K
API I/O (1M)
$2.5 / $10
How to Use
Output Speed
Arena Overall
Intelligence Index
Coding Index
Math Index
LiveBench
ForecastBench
GPQA Diamond
HLE
MMLU-Pro
AIME 2025
MATH-500
LB Reasoning
LB Math
LB Data Analysis
LiveCodeBench
LB Coding
LB Agentic
TAU2
TerminalBench
SciCode
IFBench
AA-LCR
Hallucination (HHEM)
Factual Consistency (HHEM)
LB Language
LB Instruction Following