Agent Evaluation

Methods for evaluating agent performance, including LLM-as-Judge patterns, metrics design, and benchmarking.

Published 2026/01/10

Implementation

$ mkdir -p ~/.claude/skills/agent-evaluation && curl -L "https://raw.githubusercontent.com/muratcankoylan/Agent-Skills-for-Context-Engineering/main/skills/agent-evaluation/SKILL.md" > ~/.claude/skills/agent-evaluation/SKILL.md

Bash / Zsh

Resources

Repository: https://github.com/muratcankoylan/Agent-Skills-for-Context-Engineering