Claude Sonnet 4.5 (Thinking) vs Grok 4.1 (Reasoning)

12 head-to-head matchups on identical queries

Claude Sonnet 4.5 (Thinking): 7.60 (3 Wins, 6 Ties, 3 Losses)
VS
Grok 4.1 (Reasoning): 7.68 (3 Wins, 6 Ties, 3 Losses)
Metric-by-Metric Comparison