GPT-5 (Generic)

OpenAI GPT-5

Rank #20 overall · 41 evaluations

Overall score: 7.72
Performance Metrics
Metric Breakdown
Relevance: 8.37
Semantic Consistency: 8.17
Human Likeness: 8.00
Readability: 7.87
Style/Tone: 7.87
Factual Accuracy: 7.21
Ensemble Agreement: 6.37
Strengths
Relevance: 8.37
Semantic Consistency: 8.17
Areas for Improvement
Ensemble Agreement: 6.37
Factual Accuracy: 7.21
Performance by Domain
Head-to-Head Record
Opponent             Wins  Losses  Ties  Avg Diff
Grok 4 (Reasoning)     27       5     4     -0.26
Claude 3.7 Sonnet      21       5     5     -0.52
Sonar Pro              19       7     5     -0.19
GPT-5                   0       0     4     +0.04