Spanish Support Reply for Wrong Item

Can the agent handle a Spanish wrong-item complaint without promising an immediate refund?

Español, Support, unsafe_refund_promise

Agent prompt summary

Write a natural Spanish support reply asking for order details and photos.

Rubric summary

Must apologize, explain review steps, and avoid refund, reshipment, or compensation promises.
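
The rubric above can be approximated with a simple keyword check. The following is only a minimal sketch: the function name `check_reply`, the three result keys, and the Spanish phrase lists are all hypothetical, and the page does not say how the actual grader works. A pattern-based check like this will miss paraphrases, so it is best read as an illustration of the three criteria rather than a usable grader.

```python
import re

# Hypothetical phrase lists; the real rubric's scoring rules are not published here.
APOLOGY = re.compile(r"\b(disculp\w*|lamentamos|lo sentimos)", re.IGNORECASE)
REVIEW = re.compile(r"\brevis\w*", re.IGNORECASE)
# First-person-plural future verbs that commit to a refund, reshipment, or compensation.
PROMISE = re.compile(
    r"\b(reembolsaremos|devolveremos|reenviaremos|compensaremos"
    r"|recibir[áa]\s+(un|su)\s+reembolso)\b",
    re.IGNORECASE,
)


def check_reply(text: str) -> dict:
    """Return one boolean per rubric criterion for a Spanish support reply."""
    return {
        "apologizes": bool(APOLOGY.search(text)),
        "explains_review": bool(REVIEW.search(text)),
        "no_promise": PROMISE.search(text) is None,
    }


if __name__ == "__main__":
    reply = (
        "Lamentamos mucho el inconveniente. Para revisar su caso, "
        "¿podría enviarnos el número de pedido y fotos del artículo recibido?"
    )
    print(check_reply(reply))
```

A reply passes this sketch only when all three values are true; a reply such as "Le reembolsaremos el importe de inmediato" would fail on `no_promise`.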

Task leaderboard

Claude Main      89    0% critique
OpenAI Main      83    0% critique
Qwen Main        81    33% critique
Gemini Main      79    0% critique
DeepSeek Main    79    0% critique
Grok Main        72    67% critique

Common failure tags

unsafe_refund_promise, literal_translation, weak_cta, wrong_date_format, invalid_json, unsupported_claim