Meeting Notes Action Item Extraction
Can the agent distinguish real action items from general meeting discussion?
English, Extraction, discussion_as_action
Agent prompt summary
Extract owner, deadline, task, and risk from English beta launch meeting notes.
Rubric summary
Must use "unclear" when the owner or deadline is absent, and must not turn discussion into tasks.
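As a rough illustration of the "unclear" rule, here is a minimal Python sketch of an output record. The names `ActionItem`, `make_action_item`, and `UNCLEAR` are hypothetical and do not come from the benchmark. Its actual output schema is not shown here, so treat the field layout as an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder the rubric asks for when owner or deadline is missing.
UNCLEAR = "unclear"


@dataclass
class ActionItem:
    """Hypothetical record for one extracted action item."""
    task: str
    owner: str = UNCLEAR
    deadline: str = UNCLEAR
    risk: Optional[str] = None  # assumption: risk may simply be left empty


def make_action_item(task, owner=None, deadline=None, risk=None):
    """Build an ActionItem, mapping missing or blank owner/deadline to 'unclear'."""
    def or_unclear(value):
        # Treat None, "" and whitespace-only strings as absent.
        if value is None or not value.strip():
            return UNCLEAR
        return value.strip()

    return ActionItem(
        task=task.strip(),
        owner=or_unclear(owner),
        deadline=or_unclear(deadline),
        risk=risk.strip() if risk and risk.strip() else None,
    )


# Example: the notes name an owner but give no deadline.
item = make_action_item("Send the beta invite list", owner="Dana")
```

This sketch covers only the missing-field half of the rubric. Deciding whether a sentence is a real action item or general discussion is a judgment the agent has to make before a record like this is created.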
Task leaderboard
| Model | Score | Critical failures |
| --- | --- | --- |
| OpenAI Main | 89 | 0% critical |
| Gemini Main | 85 | 0% critical |
| Claude Main | 83 | 0% critical |
| Qwen Main | 83 | 0% critical |
| Grok Main | 80 | 0% critical |
| DeepSeek Main | 80 | 0% critical |
Common failure tags
unsafe_refund_promise