Claude Opus 4.6 (Adaptive)
Anthropic Claude Opus
Rank #16 overall · 117 evaluations
Overall score: 8.16
Performance Metrics
Metric Breakdown: Relevance, Semantic Consistency, Style/Tone, Human Likeness, Readability, Factual Accuracy, Ensemble Agreement
Strengths
Relevance: 8.95
Semantic Consistency: 8.38
Areas for Improvement
Ensemble Agreement: 5.47
Factual Accuracy: 7.57
Head-to-Head Record
| Opponent | Wins | Losses | Ties | Avg Diff |
|---|---|---|---|---|
| Gemini 3 Pro | 5 | 0 | 4 | +2.67 |
| Claude Sonnet 4.5 (Thinking) | 0 | 0 | 7 | +0.05 |
| Claude Sonnet 4 | 0 | 0 | 4 | +0.06 |
| Claude Sonnet 4.5 | 0 | 0 | 4 | +0.06 |