Agent Evaluation
Methods for evaluating agent performance, including LLM-as-Judge patterns, metrics design, and benchmarking.
@muratcankoylan · published 2026/01/10
Implementation
Bash / Zsh:
$ mkdir -p ~/.claude/skills/agent-evaluation && curl -L "https://raw.githubusercontent.com/muratcankoylan/Agent-Skills-for-Context-Engineering/main/skills/agent-evaluation/SKILL.md" > ~/.claude/skills/agent-evaluation/SKILL.md
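Because the install command redirects whatever curl receives into SKILL.md, a failed download can leave an empty or wrong file behind. The following is a minimal sketch of a post-install check, assuming the path from the command above; `check_skill` is a hypothetical helper written for illustration and is not part of the skill itself.

```shell
# Hypothetical helper (not part of the skill): report whether a
# downloaded skill file exists and is non-empty.
check_skill() {
  if [ -s "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or empty: $1"
    return 1
  fi
}

# Path taken from the install command above.
check_skill "$HOME/.claude/skills/agent-evaluation/SKILL.md" \
  || echo "re-run the install command"
```

The check only confirms that the file has content; it does not validate that the content is the expected skill definition, so opening the file once after installing is still worthwhile.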
