Grok Code
xAI (Grok)
Rank #31 overall · 12 evaluations
Score: 2.02
Performance Metrics
Metric Breakdown: Style/Tone, Ensemble Agreement, Semantic Consistency, Human Likeness, Readability, Factual Accuracy, Relevance
Strengths
Style/Tone: 2.42
Ensemble Agreement: 2.25
Areas for Improvement
Relevance: 1.79
Factual Accuracy: 2.00
Head-to-Head Record
| Opponent | Wins | Losses | Ties | Avg Diff |
|---|---|---|---|---|
| Grok 4.1 (Non-Reasoning) | 0 | 1 | 4 | -0.52 |
| Grok 4.1 (Reasoning) | 0 | 1 | 4 | -0.12 |
| Grok 4 (Non-Reasoning) | 0 | 1 | 4 | -0.46 |