Definition
A group of related evaluation tasks such as support, writing, or extraction.
Task families help readers compare agents by the kind of work they need done. An agent that scores highest on writing tasks may not be the safest support agent or the best extraction agent.
Buying by task family prevents teams from choosing a strong generalist for the wrong workflow.
Support tasks test policy boundaries, while extraction tasks test schema reliability.
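As a rough illustration of why the comparison is done per family, the sketch below averages scores within each task family and picks a leader for each, alongside the overall leader. The agent names, the scores, and the helper functions are all hypothetical and invented for this example. They are not results from any real evaluation.

```python
from collections import defaultdict

# Hypothetical results: (agent, task_family, score between 0 and 1).
# These numbers are made up purely for illustration.
RESULTS = [
    ("agent_a", "writing", 0.92),
    ("agent_a", "support", 0.85),
    ("agent_a", "extraction", 0.70),
    ("agent_b", "writing", 0.70),
    ("agent_b", "support", 0.80),
    ("agent_b", "extraction", 0.90),
]


def mean_by_family(results):
    """Return {family: {agent: mean score}} for the given results."""
    totals = defaultdict(lambda: [0.0, 0])  # (family, agent) -> [sum, count]
    for agent, family, score in results:
        entry = totals[(family, agent)]
        entry[0] += score
        entry[1] += 1
    table = defaultdict(dict)
    for (family, agent), (total, count) in totals.items():
        table[family][agent] = total / count
    return dict(table)


def best_per_family(results):
    """Return {family: agent with the highest mean score in that family}."""
    table = mean_by_family(results)
    return {family: max(scores, key=scores.get) for family, scores in table.items()}


def best_overall(results):
    """Return the agent with the highest mean score across all tasks."""
    totals = defaultdict(lambda: [0.0, 0])  # agent -> [sum, count]
    for agent, _family, score in results:
        totals[agent][0] += score
        totals[agent][1] += 1
    return max(totals, key=lambda agent: totals[agent][0] / totals[agent][1])


if __name__ == "__main__":
    print("Overall leader:", best_overall(RESULTS))
    for family, agent in best_per_family(RESULTS).items():
        print(f"Leader for {family}: {agent}")
```

With this invented data, the agent with the best overall average is not the leader for extraction, which is the situation that comparing by task family is meant to surface.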