AI Agent term

Failure Tag

A label that explains what went wrong in an agent output.

Definition

Failure tags make benchmark results easier to audit. Instead of only showing a score, they identify recurring problems such as literal translation, invalid JSON, missing fields, or unsupported claims.
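As a rough illustration, two of the tags named above (invalid JSON and missing fields) could be assigned by a small check like the one below. This is a minimal sketch, not a reference implementation: the function name tag_output, the tag strings, and the REQUIRED_FIELDS tuple are all assumptions made for the example.

```python
import json

# Hypothetical required fields for an agent's JSON output (assumed for this sketch).
REQUIRED_FIELDS = ("title", "summary")


def tag_output(raw, required=REQUIRED_FIELDS):
    """Return the list of failure tags that apply to one raw agent output."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # The output cannot be parsed at all, so no further checks are possible.
        return ["invalid_json"]

    tags = []
    # A non-object payload (list, string, number) cannot hold the required fields.
    if not isinstance(parsed, dict) or any(f not in parsed for f in required):
        tags.append("missing_fields")
    return tags
```

An empty list means none of the checked failures applied; it does not mean the output is correct. Tags such as literal translation or unsupported claims usually need a human reviewer or a separate grader rather than a structural check.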

Why it matters

Tags show which kinds of failure recur, so teams can decide which raw outputs to inspect first and where to add guardrails.
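One way to turn tags into an inspection order is to count how often each tag appears across a benchmark run. The sketch below assumes each result is a dict with a "tags" list; that record shape and the name rank_failure_tags are hypothetical and chosen only for illustration.

```python
from collections import Counter


def rank_failure_tags(results):
    """Count failure tags across results, most frequent first.

    Each result is assumed to be a dict with a "tags" list of strings.
    """
    counts = Counter(tag for result in results for tag in result["tags"])
    return counts.most_common()
```

The most frequent tag is a reasonable place to start reading raw outputs, although a rare tag can still matter more if its failures are severe.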

Example

The tag literal_translation means the output translated the words but missed the local tone or market conventions.