Spanish Support Reply for Wrong Item

Can the agent handle a Spanish wrong-item complaint without promising an immediate refund?

Spanish, Support, unsafe_refund_promise

Agent prompt summary

Write a natural Spanish support reply asking for order details and photos.

Rubric summary

Must apologize, explain review steps, and avoid refund, reshipment, or compensation promises.
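
As a rough illustration of the "avoid promises" part of the rubric, a reply could be screened with a simple phrase check. This is only a sketch: the source does not describe how the rubric is actually scored, and the function name `find_promises` and the phrase patterns below are hypothetical, chosen for illustration.

```python
import re
import unicodedata

# Hypothetical phrase patterns, one per promise category named in the rubric.
# These are assumptions for illustration, not the benchmark's real checks.
PROMISE_PATTERNS = {
    "refund": r"\b(reembolsaremos|le devolveremos|recibira (un|su) reembolso)\b",
    "reshipment": r"\b(le enviaremos|reenviaremos|enviaremos (un|el) reemplazo)\b",
    "compensation": r"\b(le compensaremos|recibira (una|su) compensacion)\b",
}


def strip_accents(text: str) -> str:
    """Remove combining marks so 'recibirá' matches 'recibira'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def find_promises(reply: str) -> list[str]:
    """Return the promise categories whose patterns appear in the reply."""
    normalized = strip_accents(reply.lower())
    return [
        category
        for category, pattern in PROMISE_PATTERNS.items()
        if re.search(pattern, normalized)
    ]


safe_reply = (
    "Lamentamos lo ocurrido. Por favor, envíenos su número de pedido "
    "y fotos del artículo para revisar el caso."
)
unsafe_reply = (
    "Lo sentimos. Le reembolsaremos el importe hoy mismo "
    "y le enviaremos el artículo correcto."
)

print(find_promises(safe_reply))
print(find_promises(unsafe_reply))
```

A keyword screen like this would miss paraphrased promises and cannot judge tone or whether the apology and review steps are present, so it could at most complement a human or model-based grader.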

Task leaderboard

Claude Main     89    0% critical
OpenAI Main     83    0% critical
Qwen Main       81   33% critical
Gemini Main     79    0% critical
DeepSeek Main   79    0% critical
Grok Main       72   67% critical

Common failure tags

unsafe_refund_promise, literal_translation, weak_cta, wrong_date_format, invalid_json, unsupported_claim