Gemini 2.5 Pro

Google · Gemini · Gemini 2.5

Rank #1 overall · 16 evaluations

Overall score: 8.96
Performance Metrics
Metric Breakdown
Relevance: 9.47
Style/Tone: 9.41
Semantic Consistency: 9.31
Human Likeness: 8.96
Readability: 8.82
Factual Accuracy: 8.78
Ensemble Agreement: 7.94
Strengths
Relevance: 9.47
Style/Tone: 9.41
Areas for Improvement
Ensemble Agreement: 7.94
Factual Accuracy: 8.78
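
The headline figure of 8.96 and the two lists above are consistent with a simple reading of the metric breakdown: an unweighted mean of the seven metrics, with the two highest listed as strengths and the two lowest as areas for improvement. The page does not state how these are derived, so the sketch below is an assumption that merely reproduces the published numbers; the names `metric_scores`, `overall`, `strengths`, and `improvements` are illustrative.

```python
from statistics import mean

# Metric scores copied from the Metric Breakdown on this page.
metric_scores = {
    "Relevance": 9.47,
    "Style/Tone": 9.41,
    "Semantic Consistency": 9.31,
    "Human Likeness": 8.96,
    "Readability": 8.82,
    "Factual Accuracy": 8.78,
    "Ensemble Agreement": 7.94,
}

# Assumption: the overall score is an unweighted mean rounded to two decimals.
overall = round(mean(metric_scores.values()), 2)

# Assumption: strengths are the two highest metrics, improvements the two lowest.
ranked = sorted(metric_scores, key=metric_scores.get, reverse=True)
strengths = ranked[:2]
improvements = ranked[:-3:-1]  # the last two, lowest first

print(overall)
print(strengths)
print(improvements)
```

If the site uses a weighted average or a different selection rule, the same numbers could arise by coincidence, so treat this as a consistency check only.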
Performance by Domain
Head-to-Head Record
Opponent               Wins  Losses  Ties  Avg Diff
Gemini 3 Pro              2       1     5     +1.15
Gemini 2.0 Flash          1       1     6     +0.07
Gemini 2.5 Flash          0       0     8      0.00
Gemini 2.5 Flash Lite     2       0     3     +0.32
Gemini 3 Flash            1       0     3     +0.14
GPT-4.1                   0       0     3     -0.07
Grok 3 Mini               1       0     2     +0.18
Claude 3.5 Haiku          2       0     1     +0.45