Grok Code

xAI (Grok) · Grok Code

Rank #31 overall · 12 evaluations

2.02
Performance Metrics
Metric Breakdown
Style/Tone: 2.42
Ensemble Agreement: 2.25
Semantic Consistency: 2.21
Human Likeness: 2.13
Readability: 2.08
Factual Accuracy: 2.00
Relevance: 1.79
Strengths
Style/Tone: 2.42
Ensemble Agreement: 2.25
Areas for Improvement
Relevance: 1.79
Factual Accuracy: 2.00
Head-to-Head Record
Opponent                  Wins  Losses  Ties  Avg Diff
Grok 4.1 (Non-Reasoning)  0     1       4     -0.52
Grok 4.1 (Reasoning)      0     1       4     -0.12
Grok 4 (Non-Reasoning)    0     1       4     -0.46