AI Model Comparison

Gemini 2.5 Flash Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It delivers faster token generation and improved benchmark performance compared to earlier Flash models, with thinking disabled by default to prioritize speed. Designed for high-throughput use cases where rapid response is more important than deep reasoning, it offers the most affordable entry point in the Gemini 2.5 lineup.

Author: Google
Release Date: 2025-09-25
Knowledge Cutoff: 2025-01-31
License: Proprietary
I/O Format:
Context Length (input / max output): 1.0M / 66K tokens
API Price, input / output (USD per 1M tokens): $0.10 / $0.40
How to Use: API Access
Output Speed: 105 tok/s

Arena Overall: 1380
Intelligence Index: 19.4
Coding Index: 14.5
Math Index: 46.7
LiveBench: 41.5
ForecastBench: 56.9
GPQA Diamond: 65.1%
HLE: 4.6%
MMLU-Pro: 79.6%
AIME 2025: 46.7%
MATH-500: n/a
LB Reasoning: 43.3
LB Math: 61.0
LB Data Analysis: 47.0
LiveCodeBench: 64.1%
LB Coding: 66.4
LB Agentic: 5.0
TAU2: 30.4%
TerminalBench: 7.6%
SciCode: 28.5%
IFBench: 41.8%
AA-LCR: 0.5
Hallucination (HHEM): 3.3%
Factual Consistency (HHEM): 96.7%
LB Language: 52.0
LB Instruction Following: 23.1
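
As a rough illustration of what the listed price and output speed mean in practice, the sketch below estimates the cost and generation time of a single request. It assumes the API price is quoted in USD per 1M tokens (input / output) and that the 105 tok/s figure applies uniformly to output tokens; the function names are hypothetical and not part of any Google SDK.

```python
# Figures copied from the listing above. The interpretation (USD per 1M tokens,
# constant output speed) is an assumption, and the helper names are illustrative.
INPUT_PRICE_PER_M = 0.10    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.40   # USD per 1M output tokens
OUTPUT_SPEED_TOK_S = 105    # listed output speed in tokens per second


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Approximate request cost in USD from token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def estimate_generation_seconds(output_tokens: int) -> float:
    """Approximate time to stream the output, ignoring time to first token."""
    return output_tokens / OUTPUT_SPEED_TOK_S


if __name__ == "__main__":
    # Example request: 10,000 input tokens and 1,000 output tokens.
    print(f"cost: ${estimate_cost_usd(10_000, 1_000):.4f}")
    print(f"time: {estimate_generation_seconds(1_000):.1f} s")
```

Under these assumptions, a request with 10,000 input tokens and 1,000 output tokens costs about $0.0014 and takes roughly 9.5 seconds to generate. Real latency also depends on time to first token and server load, which this sketch ignores.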
