GPT-4.1 Nano
OpenAI GPT-4
Rank #4 overall · 21 evaluations
Overall score: 8.74
Performance Metrics
Metric Breakdown
Metrics: Relevance, Semantic Consistency, Style/Tone, Human Likeness, Readability, Ensemble Agreement, Factual Accuracy
Strengths
Relevance: 9.45
Semantic Consistency: 9.10
Areas for Improvement
Factual Accuracy: 7.67
Ensemble Agreement: 7.69
Performance by Domain
Head-to-Head Record
| Opponent | Wins | Losses | Ties | Avg Diff |
|---|---|---|---|---|
| Gemini 3 Flash | 4 | 0 | 8 | +0.50 |
| Sonar | 6 | 0 | 5 | +1.21 |
| GPT-4.1 | 1 | 1 | 8 | -0.15 |
| GPT-4o | 5 | 1 | 3 | +0.28 |
| GPT-5 Mini | 1 | 4 | 4 | +0.09 |
| Jamba Large | 2 | 0 | 3 | +0.17 |
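The head-to-head record can be summarized with a few lines of Python. The following is a minimal sketch, not part of the original evaluation tooling: the `Record` class, the `decisive_win_rate` property (wins divided by wins plus losses, ignoring ties), and the `totals` helper are hypothetical names introduced for illustration. The numbers are copied from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One row of the head-to-head table (hypothetical container)."""
    opponent: str
    wins: int
    losses: int
    ties: int
    avg_diff: float

    @property
    def games(self) -> int:
        # Total comparisons against this opponent, ties included.
        return self.wins + self.losses + self.ties

    @property
    def decisive_win_rate(self) -> Optional[float]:
        # Share of non-tied comparisons that were wins.
        # Returns None when every comparison was a tie.
        decisive = self.wins + self.losses
        return self.wins / decisive if decisive else None


# Values copied from the head-to-head table.
RECORDS = [
    Record("Gemini 3 Flash", 4, 0, 8, 0.50),
    Record("Sonar", 6, 0, 5, 1.21),
    Record("GPT-4.1", 1, 1, 8, -0.15),
    Record("GPT-4o", 5, 1, 3, 0.28),
    Record("GPT-5 Mini", 1, 4, 4, 0.09),
    Record("Jamba Large", 2, 0, 3, 0.17),
]


def totals(records):
    """Return (wins, losses, ties) summed over all opponents."""
    return (
        sum(r.wins for r in records),
        sum(r.losses for r in records),
        sum(r.ties for r in records),
    )


if __name__ == "__main__":
    for r in RECORDS:
        rate = r.decisive_win_rate
        rate_text = "n/a" if rate is None else f"{rate:.0%}"
        print(f"{r.opponent:<15} games={r.games:<3} "
              f"decisive win rate={rate_text:<5} avg diff={r.avg_diff:+.2f}")
    print("totals (W, L, T):", totals(RECORDS))
```

Ties make up most of the comparisons in the table, so a win rate over decisive comparisons only should be read alongside the tie count and the average difference rather than on its own.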