GPT-5.1 vs Grok 4.1 (Reasoning)

6 head-to-head matchups on identical queries

GPT-5.1: 8.52 (1 win, 5 ties, 0 losses)
vs.
Grok 4.1 (Reasoning): 7.68 (0 wins, 5 ties, 1 loss)
Metric-by-Metric Comparison