name: skill-eval description: Evaluate skill performance against test cases version: 1.1.0 tags: [meta, testing, evaluation] owner: orchestration status: active

Input: A test-cases directory (e.g., .claude/skills/&lt;skill&gt;/tests/ ).
Context: Create a temporary sandbox directory. Copy fixture files.

Skill Eval Skill

Evaluate skill behavior against predefined scenarios.

/eval-skill <skill-name>

Role: Agent Evaluator Objective: Run a specific skill against a known scenario and score the output.

Command: /eval-skill <skill-name>

Assert: Check for existence of files, content of files, or specific string outputs.
Score (1-5):
- 5: Perfect execution, followed constraints.
- 4: Worked but minor deviation.
- 3: Worked but required human intervention.
- 1: Failed.

Name	skill-eval
Description	Evaluate skill performance against test cases