Agent Skill
2/7/2026

recon-workbench

Run authorized, evidence-backed Recon Workbench (rwb) workflows (doctor/authorize/plan/run/summarize/manifest/validate/reconcile) and produce evidence-cited findings. Use when interrogating macOS/iOS, web/React, or OSS targets under explicit scope/permission.

jscraik
npx skills add jscraik/Agent-Skills

SKILL.md


name: recon-workbench
description: Run authorized, evidence-backed Recon Workbench (rwb) workflows (doctor/authorize/plan/run/summarize/manifest/validate/reconcile) and produce evidence-cited findings. Use when interrogating macOS/iOS, web/React, or OSS targets under explicit scope/permission.

Recon Workbench (rwb)

Recon Workbench is a CLI-first interrogation platform. It is designed to produce evidence-backed findings under explicit authorization, with deterministic artifacts and validation.

When you respond while this skill is active, answer with sections titled exactly: Outputs and Procedure (and include authorization notes).

Scope and triggers

  • Running rwb CLI flows (doctor, authorize, plan, run, summarize, manifest, validate, reconcile).
  • Designing/updating probe catalogs, schemas, or validation scripts.
  • Producing evidence-backed findings/reports with artifact citations (no speculation).
  • If the target is a web app and source is unavailable/minified, use web-app-interrogate.

Philosophy

  • Evidence over inference: if you can’t cite an artifact, label it a hypothesis.
  • Least privilege: start static/read-only, escalate only when justified and allowed.
  • Safety first: stop on unclear authorization, scope violations, or circumvention pressure.

Constraints (non-negotiable)

  • Authorization required before any target-specific interrogation.
  • Evidence-only claims: every claim must cite an artifact path under runs/....
  • No circumvention: no DRM bypass, no cracking, no private user data access.
  • Least privilege: start read-only; escalate only when justified and permitted.
  • Redact by default: scrub secrets from logs, HARs, screenshots, and reports.
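
As an illustration of the redact-by-default constraint, here is a minimal Python sketch of scrubbing secrets from text artifacts before they reach a report. The patterns and the `redact` helper are hypothetical and are not part of rwb; the repo's docs/reference/DATA_HANDLING.md remains the authority on what must be redacted.

```python
import re

# Hypothetical patterns for illustration only; they are not taken from rwb.
SECRET_PATTERNS = [
    # "Authorization: Bearer <token>" style headers
    (re.compile(r"(?i)(authorization:\s*bearer\s+)\S+"), r"\1[REDACTED]"),
    # key=value or key: value pairs whose key looks sensitive
    (
        re.compile(r"(?i)\b(api[_-]?key|token|password|secret)(\s*[=:]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
]


def redact(text: str) -> str:
    """Return text with every match of SECRET_PATTERNS replaced by a marker."""
    for pattern, replacement in SECRET_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


print(redact("password=hunter2 user=bob"))
```

A pattern list like this only catches secrets in the shapes it knows about, so it would complement a manual review of HARs and screenshots, not replace it.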

Compliance (Recon Workbench repo)

When operating inside ~/dev/recon-workbench, follow:

  • docs/agents/* (start at docs/agents/cli.md)
  • docs/reference/GOLD_STANDARD.md
  • docs/reference/AUTHORIZATION_CHECKLIST.md
  • docs/reference/DATA_HANDLING.md

Entrypoints (Recon Workbench repo)

Prefer running inside the repo via mise to ensure uv, Python, and Node are correct:

mise exec -- uv run python -m rwb <command> [args...]

If uv is already on PATH, uv run python -m rwb ... is equivalent.

Secondary/legacy wrapper:

  • ./recon <command> (shell wrapper around scripts/recon_cli.py)

Target kinds

| Kind | Description | Example locator |
| --- | --- | --- |
| macos-app | macOS applications | /Applications/MyApp.app |
| ios-sim | iOS Simulator apps | com.example.MyApp |
| ios-device | iOS device apps | com.example.MyApp |
| web-app | Web applications | https://example.com |
| oss-repo | Open source repositories | owner/repo or git URL |

Probe Sets

Predefined probe sets live in probes/catalog.json. Common sets (not exhaustive):

  • macos-baseline, macos-objc-static, macos-debug, macos-accessibility
  • ios-baseline, ios-objc-static, ios-debug, ios-smoke
  • ios-diagnose, ios-device-diagnose, ios-sim-diagnose-pack, ios-device-diagnose-pack, diagnose-pack
  • web-baseline, web-stimulus
  • oss-baseline, oss-full

Escalation Levels

  • read_only: Static analysis, no code execution
  • instrumentation: Log capture, tracing, non-invasive monitoring
  • escalation: Debug builds, LLDB, dynamic analysis (requires explicit authorization)

Scope Configuration

Create scope.yaml to set organizational defaults:

# Disallow dangerous probes
disallowed_probes:
  - "debug.lldb_backtrace"

# Limit escalation level
max_escalation_level: "instrumentation"  # read_only < instrumentation < escalation

# Require authorization
require_authorization: true
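
To make the interaction between `disallowed_probes` and `max_escalation_level` concrete, the following Python sketch shows one way such a scope could be evaluated. It is not rwb's implementation: the `probe_allowed` helper, the probe id `static.strings`, and the fallback to `read_only` when no level is set are all assumptions made for illustration.

```python
# Ordering taken from the scope.yaml comment: read_only < instrumentation < escalation.
LEVEL_ORDER = ["read_only", "instrumentation", "escalation"]


def probe_allowed(probe_id: str, probe_level: str, scope: dict) -> bool:
    """Return True if the probe passes the deny list and the escalation cap."""
    if probe_id in scope.get("disallowed_probes", []):
        return False
    # Assumption: with no cap configured, fall back to the least privileged level.
    max_level = scope.get("max_escalation_level", "read_only")
    return LEVEL_ORDER.index(probe_level) <= LEVEL_ORDER.index(max_level)


# The scope.yaml example, written out as the dict a YAML loader would produce.
scope = {
    "disallowed_probes": ["debug.lldb_backtrace"],
    "max_escalation_level": "instrumentation",
    "require_authorization": True,
}

print(probe_allowed("static.strings", "read_only", scope))
print(probe_allowed("debug.lldb_backtrace", "escalation", scope))
```

Under this reading, a probe is rejected either because it is named in the deny list or because its level sits above the cap.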

Cognitive Support / Plain-Language

  • Optimize for low cognitive load (TBI support): one task at a time, explicit steps.
  • Use plain language first; define jargon in parentheses.
  • Keep steps short and checklist-driven where possible.
  • Externalize state: decisions, assumptions, and the next step.
  • Provide ELI5 explanations for non-trivial logic.
  • Ask one question at a time; prefer multiple-choice when possible.

Required inputs

  • target_id: Unique identifier for the target
  • target_kind: One of macos-app, ios-sim, ios-device, web-app, oss-repo
  • target_locator: Path, URL, bundle ID, or repo identifier
  • probe_set or probes: Predefined probe set or custom probe list
  • authorization: Authorization artifact (required when scope enforces authorization)
  • run_dir: Output directory for artifacts (use runs/<target_id>/...)
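
The required inputs can be checked before any command is run. The sketch below is a hypothetical pre-flight check in Python: the `ReconInputs` class and its `problems` method are illustrative names, not rwb APIs, and the rules simply restate the list above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# The five target kinds named in the required inputs.
TARGET_KINDS = ("macos-app", "ios-sim", "ios-device", "web-app", "oss-repo")


@dataclass(frozen=True)
class ReconInputs:  # hypothetical container, not part of rwb
    target_id: str
    target_kind: str
    target_locator: str
    probe_set: Optional[str] = None
    probes: Tuple[str, ...] = ()
    authorization: Optional[str] = None  # path to the authorization artifact
    run_dir: Optional[str] = None

    def problems(self, require_authorization: bool = True) -> list:
        """Return readable problems; an empty list means the inputs look complete."""
        found = []
        if not self.target_id:
            found.append("target_id is empty")
        if self.target_kind not in TARGET_KINDS:
            found.append(f"unknown target_kind: {self.target_kind}")
        if not self.target_locator:
            found.append("target_locator is empty")
        if not self.probe_set and not self.probes:
            found.append("provide probe_set or probes")
        if require_authorization and not self.authorization:
            found.append("authorization artifact is required")
        expected = f"runs/{self.target_id}/"
        if not self.run_dir or not (self.run_dir.rstrip("/") + "/").startswith(expected):
            found.append(f"run_dir should live under {expected}")
        return found


inputs = ReconInputs(
    target_id="myapp",
    target_kind="macos-app",
    target_locator="/Applications/MyApp.app",
    probe_set="macos-baseline",
    authorization="authorization.json",
    run_dir="runs/myapp/",
)
print(inputs.problems())
```

Returning a list of problems instead of raising on the first one fits the one-question-at-a-time style: the operator can be asked about each gap in turn.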

Deliverables

Structure: runs/<target>/<session>/<run>/

  • raw/ - Probe artifacts (logs, dumps, traces, HARs)
  • manifest.json - SHA256 hashes for integrity verification
  • derived/findings.json - Schema-valid findings with evidence citations; include schema_version
  • derived/report.md - Human-readable summary with artifact paths
  • derived/report.json - Machine-readable report (when generated); include schema_version when schema-bound

Schema source of truth in the repo: config/schemas/ (e.g. config/schemas/findings.v2.schema.json).
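
To show what hash-based integrity verification involves, here is a minimal Python sketch that hashes everything under raw/ and later reports files that changed. The flat `{relative path: hex digest}` layout is an assumption for illustration; the real manifest.json layout is defined by the schemas in config/schemas/ and may differ.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large traces do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(run_dir: Path) -> dict:
    """Map each file under run_dir/raw to its SHA-256 (assumed manifest layout)."""
    raw = run_dir / "raw"
    return {
        path.relative_to(run_dir).as_posix(): sha256_file(path)
        for path in sorted(raw.rglob("*"))
        if path.is_file()
    }


def changed_artifacts(run_dir: Path, manifest: dict) -> list:
    """Return manifest paths that are missing or whose hash no longer matches."""
    changed = []
    for rel_path, expected in manifest.items():
        candidate = run_dir / rel_path
        if not candidate.is_file() or sha256_file(candidate) != expected:
            changed.append(rel_path)
    return changed
```

An empty result from `changed_artifacts` means every listed artifact still matches its recorded hash. It does not detect files added after the manifest was written.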

Procedure

1) Check Toolchain

mise exec -- uv run python -m rwb doctor --json

2) Create Authorization (required by scope)

mise exec -- uv run python -m rwb authorize \
  --target-id myapp \
  --target-kind macos-app \
  --target-locator "/Applications/MyApp.app" \
  --output authorization.json

3) Generate Probe Plan

mise exec -- uv run python -m rwb plan \
  --target-id myapp \
  --target-kind macos-app \
  --target-locator "/Applications/MyApp.app" \
  --probe-set macos-baseline \
  --authorization authorization.json

4) Execute Probes

mise exec -- uv run python -m rwb run \
  --plan-file probe-plan.json \
  --run-dir runs/myapp/

5) Generate Findings + Report

mise exec -- uv run python -m rwb summarize \
  --run-dir runs/myapp/

6) Validate Artifacts (CLI-first)

mise exec -- uv run python -m rwb validate \
  --catalog probes/catalog.json \
  --plan probe-plan.json

mise exec -- uv run python -m rwb validate \
  --evidence runs/myapp/derived/findings.json \
  --run-dir runs/myapp
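
When the steps above are driven from a script, it can help to build each command as an argument list first. The Python sketch below does only that: the `rwb_command` helper is hypothetical, and the subcommands and flags are copied from the commands shown in this procedure, not from rwb's full interface.

```python
import shlex

# Launcher prefix copied from the Entrypoints section.
RWB_PREFIX = ["mise", "exec", "--", "uv", "run", "python", "-m", "rwb"]


def rwb_command(subcommand: str, **options) -> list:
    """Build an argv list; target_id="x" becomes --target-id x, json=True becomes --json."""
    argv = RWB_PREFIX + [subcommand]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)
        else:
            argv += [flag, str(value)]
    return argv


target = {
    "target_id": "myapp",
    "target_kind": "macos-app",
    "target_locator": "/Applications/MyApp.app",
}

# The same sequence as steps 1 to 6, in order.
steps = [
    rwb_command("doctor", json=True),
    rwb_command("authorize", **target, output="authorization.json"),
    rwb_command("plan", **target, probe_set="macos-baseline", authorization="authorization.json"),
    rwb_command("run", plan_file="probe-plan.json", run_dir="runs/myapp/"),
    rwb_command("summarize", run_dir="runs/myapp/"),
    rwb_command("validate", catalog="probes/catalog.json", plan="probe-plan.json"),
    rwb_command("validate", evidence="runs/myapp/derived/findings.json", run_dir="runs/myapp"),
]

for argv in steps:
    print(shlex.join(argv))
```

Passing each list to `subprocess.run(argv, check=True)` would stop the sequence at the first failing step, which is usually the desired behavior for a gated workflow.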

Validation

Fail fast: stop at the first failed validation gate, fix it, then re-run the same gate.

# Validate probe catalog
python scripts/validate_catalog.py --catalog probes/catalog.json

# Validate a manifest
python scripts/validate_manifest.py runs/myapp/manifest.json

# Validate evidence paths in findings
python scripts/validate_evidence.py runs/myapp/derived/findings.json runs/myapp/
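
For intuition about what an evidence-path check looks for, here is a small Python sketch. It is not the repo's scripts/validate_evidence.py, and the findings layout it assumes (a `findings` list whose items carry `id` and `evidence` keys) is hypothetical; the actual layout is set by the schema under config/schemas/.

```python
from pathlib import Path


def evidence_problems(findings_doc: dict, run_dir: Path) -> list:
    """List findings that cite nothing, cite missing files, or cite paths outside run_dir."""
    problems = []
    root = run_dir.resolve()
    for finding in findings_doc.get("findings", []):
        finding_id = finding.get("id", "<no id>")
        paths = finding.get("evidence", [])
        if not paths:
            problems.append(f"{finding_id}: no evidence cited")
        for rel_path in paths:
            candidate = (root / rel_path).resolve()
            if not candidate.is_relative_to(root):  # requires Python 3.9+
                problems.append(f"{finding_id}: {rel_path} is outside the run directory")
            elif not candidate.is_file():
                problems.append(f"{finding_id}: {rel_path} does not exist")
    return problems
```

An empty list corresponds to a passing gate. Any entry is a reason to stop and fix the finding before moving on.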

Escalation Ladder (Worst-Case Path)

  1. Static inventory (safe, read-only)
  2. Baseline run (minimal interaction)
  3. Stimulus run (targeted action)
  4. Diff (baseline vs stimulus)
  5. Advanced observation (approved tools only; stop if protections block)

Stop conditions:

  • Goals are met with evidence
  • Further steps require circumvention or exceed authorization
  • Signals flatten (no new findings across two successive probes)
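
The "signals flatten" stop condition can be stated as a simple check. This Python sketch assumes the operator tracks how many new findings each probe produced; the `signals_flattened` name and that bookkeeping are illustrative, not part of rwb.

```python
def signals_flattened(new_findings_per_probe: list, window: int = 2) -> bool:
    """True when each of the last `window` probes produced zero new findings."""
    if len(new_findings_per_probe) < window:
        return False
    return all(count == 0 for count in new_findings_per_probe[-window:])


print(signals_flattened([3, 1, 0, 0]))
```

The default window of two mirrors the "two successive probes" wording above.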

Evidence Discipline

  • Every finding must cite one or more evidence paths
  • Summaries must list commands used + artifact locations
  • If evidence is insufficient, request additional probes rather than speculating
  • Redact HAR files before sharing and record redaction in the report
  • Use manifest.json to verify artifact integrity in data/runs/... (or legacy runs/...)

Build Mode (Tooling Design)

When creating or evolving the workbench:

  • Design schemas in config/schemas/ with JSON Schema validation
  • Add probes to probes/catalog.json (an alias for config/probes/catalog.json), specifying target kinds and timeouts
  • Implement probe scripts in scripts/probes/
  • Define probe sets for common workflows
  • Update AGENTS.md with agent instructions
  • Add validation scripts to scripts/validate_*.py

Inspect Mode (Evidence Collection)

When analyzing a target:

  1. Confirm authorization and target type
  2. Select appropriate probe set
  3. Execute probes and collect artifacts
  4. Generate findings with evidence citations
  5. Validate all artifacts and evidence paths
  6. Produce report with artifact links

Variation Rules

  • Vary probe depth by authorization level and target risk profile
  • Vary artifact collection based on target type (Apple, web, OSS) and goal
  • Avoid repeating the same probe sequence across unrelated targets
  • Prefer different variations when signals flatten

Empowerment Principles

  • Operators: Explicit stop conditions and safe rollback options
  • Teams: Multiple probe paths when trade-offs exist
  • Stakeholders: Clear evidence links and decision-ready summaries
  • Reviewers: Direct artifact pointers for verification

Anti-Patterns to Avoid

  • Acting without explicit authorization or documented scope
  • Skipping evidence capture while reporting conclusions
  • Using intrusive probes when static inventory suffices
  • Escalating by default instead of justifying each step
  • Treating unknowns as confirmed facts
  • Relying on inferred behavior without artifacts

Examples

  • “Design a controlled evidence-only interrogation workflow for an authorized web/React target, including probe plan and reporting outputs.”
  • “Run probes on an unauthorized target.” (Expected: refuse + explain authorization gate.)
  • “Summarize findings without any artifacts.” (Expected: refuse + request artifacts.)

Remember

The agent is capable of extraordinary work in this domain. Use judgment, adapt to context, and push boundaries when appropriate, always within the authorized scope and the constraints above.

Extended guidance

See references/extended.md for additional examples, workflows, and appendices.

<!-- skill-score-boost-v1 -->

Philosophy and tradeoffs

  • Use this skill when consistent decision-making matters more than one-off execution; project context should drive the approach.
  • Principle and mindset: prioritize tradeoffs and constraints over rigid checklists; understand why each step exists.
  • To keep outcomes robust, ask: Why is this the right default, and what could change the outcome? How do we adapt if constraints shift? What evidence is needed before choosing one path over another?

Anti-patterns and caveats

  • Avoid applying this playbook generically without checking repository-specific context.
  • NEVER skip required validation gates when behavior changes.
  • DO NOT use this skill as a rigid replacement for engineering judgment.
  • DON'T ignore warnings or assume one pattern fits all repos.
  • Common pitfall: treating anti-patterns as optional.
  • Incorrect assumptions here can lead to fragile guidance.
  • Warning: wrong sequencing can create avoidable regressions.

Variation and adaptation

  • Vary the workflow by team size, risk, and deployment target.
  • Use different strategies for small, medium, and large changes.
  • Adapt recommendations to the specific environment and avoid repetitive templates.
  • Avoid generic or cookie-cutter responses; craft context-specific alternatives.
  • Keep outputs diverse and not repetitive.
  • Converge on a custom path only after evidence review.
  • Different constraints should produce different, non-generic recommendations.

Empowering execution style

  • Explore multiple options and enable the team to make safe decisions.
  • Build confidence by explaining options and tradeoffs clearly.
  • Feel free to be creative while staying rigorous and precise.
  • Push boundaries with practical alternatives when simple recipes fail.
  • Enable outcomes-oriented problem solving.
<!-- decision-feedback-protocol:v2 -->

Decision feedback protocol (required):

  • If post-run feedback capture is enabled for this runtime, emit a non-blocking post_run_feedback event via request_user_input after result delivery.
  • Capture: decision (accepted|partial|rejected|deferred), outcome (good|neutral|bad|unknown), and confidence (high|medium|low).
  • Persist with: python3 utilities/skill-creator/scripts/record_skill_feedback.py --skill-path <path/to/SKILL.md> --decision <...> --outcome <...> --confidence <...> --notes "...".
  • The recorder tags subject (for example ui, code_review, backend, security) for cross-domain quality analytics.
<!-- /decision-feedback-protocol -->
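
The feedback record described in the decision feedback protocol can be validated before it is persisted. The Python sketch below checks the three fields against the allowed values listed there and builds the recorder command from the flags shown. The `feedback_argv` helper and the example skill path are hypothetical.

```python
# Allowed values as listed in the decision feedback protocol.
DECISIONS = ("accepted", "partial", "rejected", "deferred")
OUTCOMES = ("good", "neutral", "bad", "unknown")
CONFIDENCES = ("high", "medium", "low")

RECORDER = "utilities/skill-creator/scripts/record_skill_feedback.py"


def feedback_argv(skill_path: str, decision: str, outcome: str,
                  confidence: str, notes: str = "") -> list:
    """Validate the fields, then return the argv for the recorder script."""
    checks = (
        ("decision", decision, DECISIONS),
        ("outcome", outcome, OUTCOMES),
        ("confidence", confidence, CONFIDENCES),
    )
    for label, value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"{label} must be one of {allowed}, got {value!r}")
    return [
        "python3", RECORDER,
        "--skill-path", skill_path,
        "--decision", decision,
        "--outcome", outcome,
        "--confidence", confidence,
        "--notes", notes,
    ]


# Hypothetical skill path, used only to show the call.
argv = feedback_argv("skills/recon-workbench/SKILL.md", "accepted", "good", "high", "report reviewed")
print(argv)
```

Rejecting unknown values up front keeps malformed records out of the feedback log.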