Agent Skill
skill-eval (2/7/2026)
Evaluate all workspace-hub skills for structural validity, content quality, cross-reference integrity, and registry consistency. Runs 18 checks across critical, warning, and info severity levels with actionable fix suggestions.
vamseeachanta
npx skills add vamseeachanta/workspace-hub
SKILL.md
---
name: skill-eval
description: Evaluate all workspace-hub skills for structural validity, content quality, cross-reference integrity, and registry consistency. Runs 18 checks across critical, warning, and info severity levels with actionable fix suggestions.
version: 1.0.0
category: development
last_updated: 2026-01-29
capabilities:
  - structural_validation
  - content_quality_analysis
  - cross_reference_integrity
  - report_generation
tools:
  - Bash
  - Read
  - Glob
  - Grep
related_skills:
  - skill-creator
  - compliance-check
  - repository-health-analyzer
  - verification-loop
requires: []
see_also: []
---
Skill Eval
Quick Start
# Full evaluation of all skills
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py
# JSON output
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --format json
# Single skill
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --skill testing-tdd-london
# Only critical issues
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --severity critical
When to Use
- Auditing skill quality after bulk creation or migration
- Pre-release validation of the skills library
- CI/CD quality gate for skill changes
- Identifying broken cross-references after renaming skills
- Checking compliance with v2 SKILL.md template format
Core Concepts
Evaluation Dimensions
The evaluator runs 18 checks organized into three severity levels:
Critical (blocks skill usage):
- YAML frontmatter exists and parses
- Required fields present: `name`, `description`
Warning (degrades quality):
- Version follows semver, category matches directory
- Required content sections present (Quick Start, When to Use, etc.)
- Code blocks in key sections, no TODO/FIXME markers
- Cross-references in `related_skills` resolve to real skills
Info (improvement opportunities):
- Uses v2 template format, has optional sections (Metrics, etc.)
- No duplicate skill names across the library
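A few of the checks above can be sketched in a handful of lines. This is an illustration only, not the code in eval-skills.py: the function names, the issue codes `required_field_missing`, `version_invalid`, and `todo_marker`, and the simplified MAJOR.MINOR.PATCH version pattern are all assumptions. The frontmatter is read with a naive `key: value` scan instead of a real YAML parser so the example needs only the standard library.

```python
import re

REQUIRED_FIELDS = ("name", "description")
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # simplified: MAJOR.MINOR.PATCH only
TODO_MARKER = re.compile(r"\b(TODO|FIXME)\b")


def split_frontmatter(text):
    """Return (frontmatter, body), or (None, text) if no '---' block leads the file."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return None, text  # opening delimiter without a closing one


def parse_top_level_keys(frontmatter):
    """Naive stand-in for a YAML parser: collect unindented 'key: value' pairs."""
    fields = {}
    for line in frontmatter.splitlines():
        match = re.match(r"^([A-Za-z_][\w-]*):\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def check_skill(text):
    """Return a list of (severity, issue_code) tuples for one SKILL.md text."""
    frontmatter, body = split_frontmatter(text)
    if frontmatter is None:
        return [("critical", "frontmatter_missing")]
    issues = []
    fields = parse_top_level_keys(frontmatter)
    for field in REQUIRED_FIELDS:
        if not fields.get(field):  # absent or empty value
            issues.append(("critical", "required_field_missing"))
    version = fields.get("version")
    if version is not None and not SEMVER.match(version):
        issues.append(("warning", "version_invalid"))
    if TODO_MARKER.search(body):
        issues.append(("warning", "todo_marker"))
    return issues
```

Returning `(severity, issue_code)` tuples keeps each check independent of how the report is later grouped or filtered.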
Report Output
Reports include:
- Summary with pass/fail counts and percentages
- Issues grouped by severity with counts
- Per-category breakdown
- Top 10 most common issues
- Per-skill details with actionable fix suggestions
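The summary portion of such a report amounts to a few counting passes. The sketch below assumes that per-skill results are available as a mapping from skill name to a list of `(severity, issue_code)` tuples; that shape and the `summarize` function are hypothetical and may not match the script's internals.

```python
from collections import Counter


def summarize(results, top_n=10):
    """Aggregate per-skill issue lists into summary counts.

    `results` maps a skill name to a list of (severity, issue_code) tuples;
    this shape is an assumption made for the example.
    """
    total = len(results)
    failed = sum(
        1 for issues in results.values()
        if any(severity == "critical" for severity, _ in issues)
    )
    passed = total - failed

    def pct(count):
        # Percentage of all evaluated skills, rounded to one decimal place.
        return round(100.0 * count / total, 1) if total else 0.0

    by_severity = Counter(sev for issues in results.values() for sev, _ in issues)
    by_code = Counter(code for issues in results.values() for _, code in issues)
    return {
        "total": total,
        "passed": passed,
        "passed_pct": pct(passed),
        "critical_failures": failed,
        "critical_pct": pct(failed),
        "issues_by_severity": dict(by_severity),
        "top_issues": by_code.most_common(top_n),
    }
```

A skill counts as passed when it has no critical issue, which is the same rule the exit codes use.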
Usage Examples
Full Evaluation
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py
Output:
================================================================
SKILL EVALUATION REPORT
Generated: 2026-01-29T14:30:00+00:00
================================================================
SUMMARY
----------------------------------------
Total skills evaluated: 230
Passed (no critical): 187 (81.3%)
Warnings only: 102 (44.3%)
Critical failures: 43 (18.7%)
JSON for CI/CD
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py \
--format json --severity critical \
--output reports/skill-eval.json
Filter by Category
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --category development
Summary Only
uv run .claude/skills/development/skill-eval/scripts/eval-skills.py --summary-only
Best Practices
- Run after creating new skills with `/skill-creator` to validate structure
- Use `--format json` in CI pipelines for machine-readable output
- Address critical issues first (missing frontmatter, invalid YAML)
- Use `--severity warning` to focus on actionable improvements
- Run `--category` filters for focused audits of specific skill areas
- If skills or index markdown changes, regenerate derived skills summary artifacts:
uv run --no-project python scripts/skills/generate-skill-summary.py
Error Handling
| Exit Code | Meaning |
|---|---|
| 0 | All skills pass (no critical issues) |
| 1 | Critical failures found |
| 2 | Script error (missing directory, invalid arguments) |
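A CI step can branch on these exit codes. The wrapper below is a minimal sketch: the labels and the `interpret_exit_code` and `run_gate` helpers are hypothetical, while the command line is the one from Quick Start with `--severity critical`.

```python
import subprocess

# Labels are illustrative; the numeric codes come from the exit code table.
EXIT_MEANINGS = {
    0: "pass",          # all skills pass (no critical issues)
    1: "critical",      # critical failures found
    2: "script_error",  # missing directory, invalid arguments
}

EVAL_COMMAND = (
    "uv", "run",
    ".claude/skills/development/skill-eval/scripts/eval-skills.py",
    "--severity", "critical",
)


def interpret_exit_code(code):
    """Map an evaluator exit code to a short label; anything else is 'unknown'."""
    return EXIT_MEANINGS.get(code, "unknown")


def run_gate(command=EVAL_COMMAND):
    """Run the evaluator and return (label, exit_code) for a CI step to act on."""
    completed = subprocess.run(list(command), check=False)
    return interpret_exit_code(completed.returncode), completed.returncode
```

Passing `check=False` keeps a non-zero exit from raising, so the caller can treat exit code 1 (skills need fixing) differently from exit code 2 (the evaluation itself did not run).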
| Common Issue | Cause | Fix |
|---|---|---|
| `frontmatter_missing` | Skill uses legacy heading format | Add `---` delimited YAML frontmatter |
| `yaml_invalid` | Syntax error in frontmatter | Fix YAML syntax (check colons, indentation) |
| `related_skill_unresolved` | Referenced skill name doesn't exist | Correct the name or remove the reference |
| `section_missing` | Missing required H2 section | Add the section heading and content |
Metrics & Success Criteria
| Metric | Target |
|---|---|
| Skills passing all critical checks | 100% |
| Skills with complete v2 sections | >80% |
| Resolved cross-references | 100% |
| Skills free of TODO/FIXME markers | 100% |
Version History
- 1.0.0 (2026-01-29): Initial release with 18 checks, human-readable and JSON output, and category/skill filtering
Skills Info
Original Name: skill-eval
Author: vamseeachanta