Agent Skill
2/7/2026
skill-eval
Evaluate skill performance against test cases
etroxtaran
npx skills add EtroxTaran/conductor
SKILL.md
| Name | skill-eval |
| Description | Evaluate skill performance against test cases |
name: skill-eval
description: Evaluate skill performance against test cases
version: 1.1.0
tags: [meta, testing, evaluation]
owner: orchestration
status: active
Skill Eval Skill
Overview
Evaluate skill behavior against predefined scenarios.
Usage
/eval-skill <skill-name>
Identity
Role: Agent Evaluator
Objective: Run a specific skill against a known scenario and score the output.
Workflow
Command: /eval-skill <skill-name>
1. Setup Scenario
- Input: A test-cases directory (e.g., .claude/skills/<skill>/tests/).
- Context: Create a temporary sandbox directory. Copy fixture files.
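The setup step can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function name `setup_sandbox` and the `skill-eval-` directory prefix are assumptions, and the fixture layout inside the tests directory is not specified by this document.

```python
import shutil
import tempfile
from pathlib import Path


def setup_sandbox(fixtures_dir: Path) -> Path:
    """Create a temporary sandbox and copy fixture files into it.

    `fixtures_dir` is a hypothetical path such as the skill's tests
    directory; a missing directory yields an empty sandbox.
    """
    sandbox = Path(tempfile.mkdtemp(prefix="skill-eval-"))
    if fixtures_dir.is_dir():
        # dirs_exist_ok (Python 3.8+) allows copying into the
        # directory that mkdtemp has already created.
        shutil.copytree(fixtures_dir, sandbox, dirs_exist_ok=True)
    return sandbox
```

Running the skill inside a throwaway copy keeps the original fixtures untouched, so the same scenario can be replayed for every evaluation.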
2. Execution
- Prompt: Write a detailed instruction that invokes the skill.
- Run: Execute the skill (simulated or real).
3. Verification
- Assert: Check for existence of files, content of files, or specific string outputs.
- Score (1-5):
  - 5: Perfect execution, followed constraints.
  - 4: Worked but minor deviation.
  - 3: Worked but required human intervention.
  - 1: Failed.
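One way to express the assertions and the rubric in code is sketched below. The `Check` type, the function names, and the `deviated` and `human_intervened` flags are hypothetical; treating any failed assertion as a score of 1 is also an assumption. The rubric above does not define a score of 2, so this sketch never returns it.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Check:
    """Hypothetical assertion: a file must exist and may need to contain text."""
    path: str
    contains: Optional[str] = None


def run_checks(sandbox: Path, checks: list[Check]) -> list[bool]:
    """Return one pass/fail result per check, in order."""
    results = []
    for check in checks:
        target = sandbox / check.path
        ok = target.is_file()
        if ok and check.contains is not None:
            ok = check.contains in target.read_text(encoding="utf-8")
        results.append(ok)
    return results


def score(results: list[bool], deviated: bool = False,
          human_intervened: bool = False) -> int:
    """Map check results and evaluator observations onto the 1-5 rubric."""
    if not all(results):
        return 1  # assumed: any failed assertion counts as "Failed"
    if human_intervened:
        return 3
    if deviated:
        return 4
    return 5
```

The two flags are left to the evaluator because the rubric's "minor deviation" and "human intervention" criteria are judgments that file checks alone cannot make.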
Output
eval_report.md: Summary of pass/fail.
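The report format is not specified here, so the following is only a sketch of what writing eval_report.md might look like. The function name `write_report`, the heading text, and the per-check PASS/FAIL layout are all assumptions.

```python
from __future__ import annotations

from pathlib import Path


def write_report(out_dir: Path, skill_name: str,
                 results: dict[str, bool], score: int) -> Path:
    """Write a pass/fail summary to eval_report.md (layout is assumed)."""
    lines = [f"# Eval report: {skill_name}", "", f"Score: {score}/5", ""]
    for name, passed in results.items():
        lines.append(f"- {'PASS' if passed else 'FAIL'}: {name}")
    report = out_dir / "eval_report.md"
    report.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return report
```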
Outputs
- Skill evaluation score and notes.
Related Skills
/skill-creator - Create new skills
Skills Info
Original Name: skill-eval
Author: etroxtaran