AI Model Comparison

Gemini 2.5 Flash TTS is Google's text-to-speech model built on the Gemini 2.5 Flash architecture and designed for real-time voice assistants, high-volume narration, and conversational applications. It supports 24 languages, offers fine-grained control over voice style and pacing, and can keep character voices consistent across multi-speaker scenarios. Its expressive output follows style prompts and adapts pacing to context, which makes it well suited to interactive voice agents and dynamic audio content production.

Author

Google

Release Date

2025-12-10

Knowledge Cutoff

Unknown

License

Proprietary

I/O Format

Context Length

8K / 16K

API Price, Input / Output (USD per 1M tokens)

$0.3 / $2.5
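
As a rough illustration of how the listed rates translate into a per-request cost, here is a minimal Python sketch. It assumes the two figures are the input and output prices in USD per 1 million tokens; the function name `estimate_cost_usd` and the token counts in the usage note are hypothetical and not part of any Google API.

```python
# Rates copied from the card; assumed to be USD per 1M tokens (input / output).
INPUT_USD_PER_1M = 0.3
OUTPUT_USD_PER_1M = 2.5


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed rates (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_1M + output_tokens * OUTPUT_USD_PER_1M
    # Rates are quoted per 1M tokens, so scale the weighted sum back down.
    return total / 1_000_000
```

Under these assumptions, a request with 2,000 input tokens and 10,000 output tokens would cost about $0.0256, with nearly all of it coming from the output side.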

How to Use

API Access

Output Speed

—

Arena Overall

—

Intelligence Index

—

Coding Index

—

Math Index

—

LiveBench

—

ForecastBench

—

GPQA Diamond

—

HLE

—

MMLU-Pro

—

AIME 2025

—

MATH-500

—

LB Reasoning

—

LB Math

—

LB Data Analysis

—

LiveCodeBench

—

LB Coding

—

LB Agentic

—

TAU2

—

TerminalBench

—

SciCode

—

IFBench

—

AA-LCR

—

Hallucination (HHEM)

—

Factual Consistency (HHEM)

—

LB Language

—

LB Instruction Following

—
