Grok 4 (Reasoning) vs Grok 4.1 (Reasoning)

3 head-to-head matchups on identical queries

Grok 4 (Reasoning): 8.08 (0 wins, 2 ties, 1 loss)
vs
Grok 4.1 (Reasoning): 7.68 (1 win, 2 ties, 0 losses)
Metric-by-Metric Comparison