Claude Main
Based on the current Arena #2 preview average score.
Compare two strong generalist agents across overall score, pass rate, critical failures, language strengths, and cost tier.
Use case: General writing, support, and high-quality multilingual workflows
Sorted by critical-failure rate; this is not a universal safety guarantee.
Prioritizes cost tier, then score.
| Metric | OpenAI Main | Claude Main |
|---|---|---|
| Overall | 86 | 87 |
| Pass rate | 92% | 97% |
| Critical-failure rate | 12% | 12% |
| Format pass | 100% | 100% |
| Win rate | 30% | 55% |
| Cost tier | premium | premium |
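
The ranking rules described above (average score, critical-failure rate, and cost tier then score) can be sketched as plain sort keys. This is a minimal illustration, not the Arena's actual implementation: the `AgentResult` class, the function names, and the `COST_TIER_RANK` ordering (including the `budget` and `standard` tiers) are assumptions made for the example. Only the numbers come from the table. The sketch also assumes a lower critical-failure rate ranks first and a cheaper tier ranks first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentResult:
    # Hypothetical record type; field names are illustrative only.
    name: str
    overall: int               # overall score from the table
    critical_failure_pct: int  # critical-failure rate, in percent
    cost_tier: str


# Assumed tier ordering, cheapest first. Only "premium" appears in the table;
# the other tier names are placeholders for illustration.
COST_TIER_RANK = {"budget": 0, "standard": 1, "premium": 2}

AGENTS = [
    AgentResult("OpenAI Main", overall=86, critical_failure_pct=12, cost_tier="premium"),
    AgentResult("Claude Main", overall=87, critical_failure_pct=12, cost_tier="premium"),
]


def rank_by_overall(agents):
    """Highest overall score first."""
    return sorted(agents, key=lambda a: -a.overall)


def rank_by_critical(agents):
    """Lowest critical-failure rate first; ties keep their input order."""
    return sorted(agents, key=lambda a: a.critical_failure_pct)


def rank_by_value(agents):
    """Cheapest cost tier first, then highest overall score."""
    return sorted(agents, key=lambda a: (COST_TIER_RANK[a.cost_tier], -a.overall))


if __name__ == "__main__":
    for rank in (rank_by_overall, rank_by_critical, rank_by_value):
        print(rank.__name__, [a.name for a in rank(AGENTS)])
```

Because both agents share a 12% critical-failure rate and the same cost tier, the critical-failure ordering is a tie (Python's stable sort leaves the input order unchanged), and the cost-tier ordering falls through to the overall score.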