agent-evaluation

Author: supercent-io
24
Last updated: 2026/3/6

Design and implement comprehensive evaluation systems for AI agents. Use when building evals for coding agents, conversational agents, research agents, or computer-use agents. Covers grader types, benchmarks, 8-step roadmap, and production integration.

Tags: agent-evaluation, evals, AI-agents, benchmarks, graders, testing, quality-assurance

What this skill is for

If you are looking for a skill built around Claude Code Skills or GitHub Skills, this page will help you decide whether it deserves a place in your standard toolbox.

Install location and repository signals

Repository path: .agent-skills/agent-evaluation/
License: not declared
Downloads: 0
Last updated: 2026/3/6
