AI Agent Glossary

A searchable long-term reference for AI agent evaluation, task types, failure tags, and business-safety terms.

An AI system that can follow goals, use context, and complete workflow steps.

A repeatable test set for comparing AI agents on documented tasks.

A failure that would be unsafe, misleading, unusable, or structurally invalid in real work.

Turning unstructured text into reliable fields such as JSON, dates, amounts, and labels.
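As a minimal sketch of this idea, the snippet below pulls a date and an amount out of a sentence and returns them as JSON-ready fields. The function name `extract_fields`, the field names, and the regular expressions are illustrative assumptions, not part of any particular evaluation suite; a real agent would handle many more formats.

```python
import json
import re
from datetime import date


def extract_fields(text: str) -> dict:
    """Pull an ISO-style date and a dollar amount out of free text (illustrative only)."""
    fields = {"date": None, "amount": None}

    # Assumption: dates appear as YYYY-MM-DD.
    date_match = re.search(r"\b(\d{4})-(\d{2})-(\d{2})\b", text)
    if date_match:
        year, month, day = map(int, date_match.groups())
        try:
            fields["date"] = date(year, month, day).isoformat()
        except ValueError:
            # Looks like a date but is not a real calendar day; leave it empty.
            fields["date"] = None

    # Assumption: amounts appear as $123 or $123.45.
    amount_match = re.search(r"\$(\d+(?:\.\d{2})?)", text)
    if amount_match:
        fields["amount"] = float(amount_match.group(1))

    return fields


sample = "Invoice dated 2024-03-15 for $129.50, payment pending."
print(json.dumps(extract_fields(sample)))
```

Missing or invalid values are returned as `None` rather than guessed, so downstream code can tell "not found" apart from a real value.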

The ability to avoid unsafe commitments, unsupported claims, and policy violations.

A label that explains what went wrong in an agent output.

A localization failure where the words are translated but the local business tone is wrong.

A structured output that can be parsed by software without repair.
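One way to check this property, sketched under the assumption that the expected output is a single JSON object with known keys, is to parse the raw agent output directly and reject anything that needs trimming or fixing first. The helper name `parses_without_repair` and the example keys are hypothetical.

```python
import json


def parses_without_repair(raw_output: str, required_keys: set) -> bool:
    """Return True only if raw_output is a JSON object containing every required key."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        # Extra prose, trailing commas, or truncated output all end up here.
        return False
    return isinstance(parsed, dict) and required_keys.issubset(parsed)


clean = '{"label": "refund_request", "amount": 20}'
wrapped = 'Here is the JSON you asked for: {"label": "refund_request", "amount": 20}'

print(parses_without_repair(clean, {"label", "amount"}))
print(parses_without_repair(wrapped, {"label", "amount"}))
```

The second call fails because the object is wrapped in prose, which is exactly the kind of output that software cannot consume without a repair step.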

Testing agents across the languages and markets where they will actually be used.

A ranking of agents by score, language, task type, or risk metric.
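The ranking described above can be sketched as a filter followed by a sort. The `Result` record, the agent names, and the scores below are made-up placeholders for illustration only, not real evaluation data.

```python
from dataclasses import dataclass


@dataclass
class Result:
    agent: str
    language: str
    score: float


def leaderboard(results: list, language: str) -> list:
    """Rank agents for one language, highest score first."""
    rows = [r for r in results if r.language == language]
    return sorted(rows, key=lambda r: r.score, reverse=True)


# Hypothetical results, invented for this example.
results = [
    Result("agent_a", "en", 0.82),
    Result("agent_b", "en", 0.91),
    Result("agent_a", "de", 0.77),
]

for rank, row in enumerate(leaderboard(results, "en"), start=1):
    print(rank, row.agent, row.score)
```

Swapping the filter field or the sort key gives the same ranking by task type or by a risk metric.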

A group of related evaluation tasks such as support, writing, or extraction.

A support failure where an agent promises a refund, credit, or compensation without authority.