Claude Opus 4.6 (Adaptive)

Anthropic · Claude Opus

Rank #16 overall · 117 evaluations

Overall score: 8.16
Performance Metrics
Metric Breakdown
Relevance: 8.95
Semantic Consistency: 8.38
Style/Tone: 8.34
Human Likeness: 8.33
Readability: 7.71
Factual Accuracy: 7.57
Ensemble Agreement: 5.47
Strengths
Relevance: 8.95
Semantic Consistency: 8.38
Areas for Improvement
Ensemble Agreement: 5.47
Factual Accuracy: 7.57
Performance by Domain
Head-to-Head Record
Opponent                      Wins  Losses  Ties  Avg Diff
Gemini 3 Pro                     5       0     4     +2.67
Claude Sonnet 4.5 (Thinking)     0       0     7     +0.05
Claude Sonnet 4                  0       0     4     +0.06
Claude Sonnet 4.5                0       0     4     +0.06