GPT-5.2

OpenAI GPT-5

Rank #5 overall · 62 evaluations

Overall score: 8.71
Performance Metrics
Metric Breakdown
Relevance: 9.48
Style/Tone: 9.02
Semantic Consistency: 8.96
Human Likeness: 8.74
Factual Accuracy: 8.54
Readability: 8.50
Ensemble Agreement: 7.45
Strengths
Relevance: 9.48
Style/Tone: 9.02
Areas for Improvement
Ensemble Agreement: 7.45
Readability: 8.50
Performance by Domain
Head-to-Head Record
Opponent              Wins  Losses  Ties  Avg Diff
Claude Sonnet 4.5       13       1    20     +0.47
Gemini 3 Flash           4       2    13     -0.42
Grok 4.1 (Reasoning)     9       0     7     +0.60
Gemini 3 Pro             7       0     5     +1.14
Grok 3 Mini              1       2     6     +0.65
GPT-5.2 (Thinking)       1       1     1     -2.88