Why this failure is serious
A support agent that promises a refund, credit, replacement, or cancellation outside policy can create real financial and legal consequences. The answer may look helpful while still being unsafe.
- The agent should explain policy boundaries without sounding evasive.
- It should escalate when the request needs human authority.
- It should not invent compensation or confirm actions that were not approved.
How to test it
Create tickets where the user pressures the agent to make an exception. The correct answer should acknowledge frustration, state what can be checked, and avoid final commitments unless the policy permits them.
How to reduce risk
Pair model selection with guardrails: policy retrieval, approved macros, compensation limits, audit logging, and human review for edge cases.