Claude Sonnet 4.5 vs GPT-5.1 (Thinking)

24 head-to-head matchups on identical queries

Claude Sonnet 4.5: 8.40 (2 wins, 14 ties, 8 losses)
GPT-5.1 (Thinking): 8.37 (8 wins, 14 ties, 2 losses)
Metric-by-Metric Comparison