Agent Profiles

Each profile reflects Multilingual Agent Arena #1, not a universal model ranking.

Claude Main

Strong writing and safety boundaries, especially in support tasks.

87
English, Support, premium
too_verbose, unsafe_refund_promise, hallucinated_signing_date

OpenAI Main

Strong generalist with balanced writing and support safety.

86
English, Writing, premium
missed_dependency, unsafe_refund_promise, hallucinated_signing_date

Qwen Main

Strong Chinese business language and structured extraction.

84
Chinese, Extraction, standard
literal_translation, wrong_intent, unnatural_japanese

Gemini Main

Reliable extraction profile with mixed localization performance.

80
English, Extraction, standard
literal_translation, wrong_date_format, unsafe_refund_promise

DeepSeek Main

Best value profile for structured extraction and classification.

79
Chinese, Extraction, low
weak_cta, missing_field, hallucinated_issue

Grok Main

Fast outputs with higher variance on business constraints.

75
English, Writing, standard
unsafe_refund_promise, unsupported_claim, invalid_json