GPT-4.1 Nano

OpenAI GPT-4

Rank #4 overall · 21 evaluations

Overall score: 8.74
Performance Metrics
Metric Breakdown
Relevance: 9.45
Semantic Consistency: 9.10
Style/Tone: 9.07
Human Likeness: 8.82
Readability: 8.71
Ensemble Agreement: 7.69
Factual Accuracy: 7.67
Strengths
Relevance: 9.45
Semantic Consistency: 9.10
Areas for Improvement
Factual Accuracy: 7.67
Ensemble Agreement: 7.69
Head-to-Head Record
Opponent        Wins  Losses  Ties  Avg Diff
Gemini 3 Flash     4       0     8     +0.50
Sonar              6       0     5     +1.21
GPT-4.1            1       1     8     -0.15
GPT-4o             5       1     3     +0.28
GPT-5 Mini         1       4     4     +0.09
Jamba Large        2       0     3     +0.17