GPT-4.1 vs GPT-4o

9 head-to-head matchups on identical queries

GPT-4.1: 8.64 (4 wins, 4 ties, 1 loss)
GPT-4o: 8.29 (1 win, 4 ties, 4 losses)
Metric-by-Metric Comparison