GPT-5 Mini
OpenAI GPT-5
Rank #3 overall · 26 evaluations
Overall score: 8.80
Performance Metrics
Metric Breakdown
Metrics evaluated: Relevance, Style/Tone, Semantic Consistency, Factual Accuracy, Human Likeness, Readability, Ensemble Agreement
Strengths
Relevance: 9.31
Style/Tone: 9.23
Areas for Improvement
Ensemble Agreement: 7.57
Readability: 8.52
Performance by Domain
Head-to-Head Record
| Opponent | Wins | Losses | Ties | Avg Diff |
|---|---|---|---|---|
| GPT-4o | 5 | 1 | 5 | +0.15 |
| Gemini 2.5 Flash | 1 | 1 | 8 | +0.04 |
| Mistral Small 3.2 | 2 | 0 | 7 | +0.22 |
| Jamba Mini | 4 | 0 | 5 | +3.10 |
| GPT-4.1 | 3 | 2 | 4 | -0.26 |
| GPT-4.1 Nano | 4 | 1 | 4 | -0.09 |
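
The per-opponent rows can be rolled up into a single aggregate record. The sketch below is only an illustration: the rows are copied from the table above, while the `summarize` helper and the "decisive win rate" (wins divided by wins plus losses, ignoring ties) are assumptions introduced here, not metrics the leaderboard itself reports.

```python
# Head-to-head rows copied from the table above: (opponent, wins, losses, ties).
RECORD = [
    ("GPT-4o", 5, 1, 5),
    ("Gemini 2.5 Flash", 1, 1, 8),
    ("Mistral Small 3.2", 2, 0, 7),
    ("Jamba Mini", 4, 0, 5),
    ("GPT-4.1", 3, 2, 4),
    ("GPT-4.1 Nano", 4, 1, 4),
]


def summarize(record):
    """Aggregate wins, losses, and ties across all opponents.

    `decisive_win_rate` is a hypothetical summary statistic (not defined by
    the source page): wins / (wins + losses), with ties excluded.
    """
    wins = sum(row[1] for row in record)
    losses = sum(row[2] for row in record)
    ties = sum(row[3] for row in record)
    decided = wins + losses
    return {
        "wins": wins,
        "losses": losses,
        "ties": ties,
        "matchups": wins + losses + ties,
        "decisive_win_rate": wins / decided if decided else None,
    }


if __name__ == "__main__":
    for key, value in summarize(RECORD).items():
        print(f"{key}: {value}")
```

Ties make up more than half of the listed matchups, so a win rate that ignores them should be read alongside the tie count rather than on its own.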