Grok 3 Mini
xAI (Grok) · Grok 3
Rank #19 overall · 14 evaluations
Overall score: 7.76
Performance Metrics
Metric Breakdown
Metrics: Style/Tone, Relevance, Semantic Consistency, Readability, Human Likeness, Factual Accuracy, Ensemble Agreement
Strengths
Style/Tone: 8.92
Relevance: 8.29
Areas for Improvement
Ensemble Agreement: 7.15
Factual Accuracy: 7.50
Head-to-Head Record
| Opponent | Wins | Losses | Ties | Avg Diff |
|---|---|---|---|---|
| Claude Sonnet 4.5 | 1 | 0 | 8 | +0.05 |
| Gemini 3 Flash | 2 | 1 | 6 | -0.69 |
| GPT-5.2 | 2 | 1 | 6 | -0.65 |
| Gemini 2.5 Pro | 0 | 1 | 2 | -0.18 |
| GPT-4.1 | 0 | 2 | 1 | -0.25 |
| Claude 3.5 Haiku | 2 | 0 | 1 | +0.28 |
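
The record above can be tallied with a short script. This is a minimal sketch, assuming the rows are copied by hand from the table; the names `RECORD` and `summarize` are illustrative and do not come from the source. It takes the Avg Diff column as given and does not try to recompute it.

```python
# Head-to-head rows copied from the table above: (opponent, wins, losses, ties).
RECORD = [
    ("Claude Sonnet 4.5", 1, 0, 8),
    ("Gemini 3 Flash", 2, 1, 6),
    ("GPT-5.2", 2, 1, 6),
    ("Gemini 2.5 Pro", 0, 1, 2),
    ("GPT-4.1", 0, 2, 1),
    ("Claude 3.5 Haiku", 2, 0, 1),
]


def summarize(record):
    """Total the wins, losses, and ties, and report the share of matchups that tied."""
    wins = sum(row[1] for row in record)
    losses = sum(row[2] for row in record)
    ties = sum(row[3] for row in record)
    matchups = wins + losses + ties
    return {
        "wins": wins,
        "losses": losses,
        "ties": ties,
        "matchups": matchups,
        "tie_share": ties / matchups,
    }


if __name__ == "__main__":
    print(summarize(RECORD))
```

On these rows the tally is 7 wins, 5 losses, and 24 ties across 36 matchups, so two thirds of the listed comparisons ended in a tie.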