QFAI (Quality-First AI)

QFAI is a quality-first development kit for AI coding agents. Its purpose is to improve the quality of AI-generated software outputs by enforcing a structured workflow and validating traceability.

Modern AI coding agents can write code quickly, but they can also misunderstand requirements, drift from intended behavior, or “sound correct” while being wrong. QFAI addresses these failure modes by standardizing an end-to-end delivery loop and forcing objective checks.

SDD clarifies what to build, so the agent does not invent requirements while coding.
ATDD defines acceptance goals as executable scenarios, so correctness is measured rather than assumed.
TDD enables a self-correcting loop: implement → run tests → fix → repeat.
Traceability validation enforces that SDD → ATDD → TDD → implementation stays aligned, reducing hallucination-driven drift.
Result: higher output quality, fewer review cycles, and lower human supervision cost.

QFAI is designed for a skills-driven operating model: engineers select a prepared custom skill and provide only the task intent. The agent reads the repository, produces the required artifacts, and iterates until the hard gates pass.

Release status

Release posture: runtime truthfulness is enforced.
Prototyping is UI-only and runs a single-thread evolution loop driven by qfai prototyping iterate --cycle <n>, with deterministic stop conditions (exit codes 0 continue / 64 convergence / 65 max-iterations / 2 input error).
Runtime observation is observed-only (no synthetic 200 / API / DB prototyping coverage).
Per-iteration evidence is screenshot.png + index.html per declared screen plus a single review.json (4-axis ordinal, prose critique, anti-slop detection, pivot directive).
The calibration SSOT (single source of truth) is the calibration pack referenced by calibrationRef.packPath.
Current repo note: some repo-wide qfai validate --fail-on error blockers still come from historical review/evidence/ATDD/TDD artifacts and are being cleaned incrementally.
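The exit-code contract of qfai prototyping iterate listed above can be sketched as a small driver. This is a minimal illustration, not QFAI's own implementation: the outcome names, the 1-based cycle numbering, and the runCycle callback (which would spawn the CLI and return its exit code) are assumptions made for the example; only the numeric codes come from this README.

```typescript
// Outcome labels are illustrative; only the numeric exit codes are documented.
type IterateOutcome =
  | "continue"
  | "converged"
  | "max-iterations"
  | "input-error"
  | "unknown";

// Map a `qfai prototyping iterate` exit code to an outcome label.
function interpretIterateExit(code: number): IterateOutcome {
  switch (code) {
    case 0:
      return "continue";
    case 64:
      return "converged";
    case 65:
      return "max-iterations";
    case 2:
      return "input-error";
    default:
      return "unknown";
  }
}

interface LoopResult {
  outcome: IterateOutcome;
  cyclesRun: number;
}

// Run cycles 1..maxCycles and stop at the first non-continue outcome.
// `runCycle` is a hypothetical callback standing in for
// `qfai prototyping iterate --cycle <n> --target-url <url>`.
function driveLoop(
  runCycle: (cycle: number) => number,
  maxCycles: number
): LoopResult {
  for (let cycle = 1; cycle <= maxCycles; cycle++) {
    const outcome = interpretIterateExit(runCycle(cycle));
    if (outcome !== "continue") {
      return { outcome, cyclesRun: cycle };
    }
  }
  // Local safety cap reached while the CLI still reported "continue".
  return { outcome: "continue", cyclesRun: maxCycles };
}
```

Keeping the CLI call behind a callback makes the stop logic testable without a running target URL.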

Quick start

Windows users: qfai init creates symlinks internally. Enable Developer Mode (Settings → System → For developers → Developer Mode: ON) before running npx qfai init; otherwise symlink creation fails due to insufficient privileges.

# 1) Initialize QFAI assets in your repository
npx qfai init

# 2) Validate traceability (use this in CI as a hard gate)
npx qfai validate

# 3) Generate a human-readable report (Markdown)
npx qfai report

What you can do (CLI commands)

npx qfai init
- Creates the QFAI workspace under .qfai/ (requirements/specs/contracts/report) and installs the AI assistant kit (assistant/ with the 4-layer tree — constitution/, manifest/, catalog/, process/ — plus agents/ and skills/), plus qfai.config.yaml.
npx qfai validate
- Validates specs/contracts/scenarios/traceability and review artifacts (.qfai/review/review-*/summary.json + minimum schema), writes .qfai/report/validate.json, and appends run logs to .qfai/report/run-*/; use --fail-on error (or --fail-on warning) to turn it into a CI gate, and --format github to emit GitHub-friendly annotations. Use --profile discussion|sdd|prototyping|atdd|tdd|verify for local skill-owned checks; CI should use default/full validation (or verify / tdd for the dedicated CI gates).
npx qfai report
- Produces a human-readable report (report.md by default) or an internal JSON export (report.json) from validate.json; use --base-url to link file paths in Markdown to your repository viewer.
npx qfai doctor
- Diagnoses configuration discovery, path resolution, glob scanning, and validate.json inputs before running validate/report; use --fail-on to enforce failures in CI.
- Use --profile prototyping to add prototyping-specific preflight checks for the primary spec, UI contracts, design contract readiness, active agent-wrapper integrations, shipped role-input readiness, Playwright CLI launcher resolution/probing, and target URL reachability.
- Note: prototyping evidence (.qfai/evidence/prototyping/prototyping.json) is produced by the AI workflow / skills (/qfai-prototyping), not by a general-purpose end-user CLI flow.
- Use npx qfai prototyping preflight --target-url <url> for a focused prototyping preflight before the skill starts; it now surfaces blocking QFAI-DCON-* design-contract issues alongside runtime assumptions, resolves a runnable Playwright CLI launcher (project wrapper / local bin / PATH / npx --no-install), and still treats the first real delegation failure as a runtime hard-stop.
- Use npx qfai prototyping iterate --cycle <n> --target-url <url> to drive each cycle of the single-thread evolution loop. Exit codes: 0 (continue), 64 (convergence), 65 (max-iterations), 2 (input error). qfai validate consumes the resulting evidence files (.qfai/evidence/prototyping/prototyping.json).
- Traceability refs inside prototyping evidence must use repo-root-relative concrete artifact refs (for example .qfai/specs/spec-0001/01_Spec.md#L3 or .qfai/evidence/render.json#/screens/0). Absolute paths are invalid.
- The same strict ref grammar is enforced for top-level and leaf evidence-bearing fields, including:
  - runtimeGate.evidenceRefs
  - runtimeGate.ui[].declaredRef
  - runtimeGate.ui[].renderEvidenceRefs[]
  - runtimeGate.ui[].browserQaEvidenceRefs[]
  - specs[].coverageRefs[].declaredRef
  - specs[].coverageRefs[].observedRefs[]
  - fullHarness.iterations[].evidenceRefs.runtimeGate
  - fullHarness.iterations[].evidenceRefs.specCoverage
  - fullHarness.iterations[].evidenceRefs.render
  - fullHarness.iterations[].evidenceRefs.browserQa
  - fullHarness.iterations[].evidenceRefs.uiObservation
  - fullHarness.iterations[].evidenceRefs.discussion
  - fullHarness.iterations[].evidenceRefs.screenContract
  - fullHarness.iterations[].evidenceRefs.trend
  - fullHarness.iterations[].l1.axes[].evidenceRefs[]
  - fullHarness.iterations[].l2.axes[].evidenceRefs[]
  - fullHarness.reviewerLogs[].evidenceRefs[]
- Semantic rules are also strict:
  - runtimeGate.ui[].declaredRef and fullHarness.iterations[].evidenceRefs.screenContract[] must use the canonical screen contract sourceRef .qfai/discussion/<pack>/uiux/40_screen_contracts.md#<screenId>.
  - specs[].coverageRefs[].declaredRef must use the canonical spec declaration form .qfai/specs/<specId>/01_Spec.md#L<line> (for example .qfai/specs/spec-0001/01_Spec.md#L3); notes.md, appendix.md, anchor-fragment forms such as #route-home, discussion refs, and screen contract refs are NOT valid declaredRef values.
- fullHarness follows a terminal-first state machine:
  - status="in-progress" requires finalDecision="pending", reviewerSignoff.status="pending", and no terminationReason.
  - status="completed" requires terminationReason, a non-pending finalDecision, and a terminal reviewerSignoff.
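To make the ref and state-machine rules above concrete, here is a minimal sketch of how such checks could look. It is not the validator's actual code: the character class allowed for <specId>, the flattened reviewerSignoffStatus field (standing in for reviewerSignoff.status), and the reading of "terminal reviewerSignoff" as "not pending" are assumptions for illustration.

```typescript
// Canonical spec declaration form: .qfai/specs/<specId>/01_Spec.md#L<line>
// The characters allowed in <specId> are an assumption.
const SPEC_DECLARED_REF =
  /^\.qfai\/specs\/[A-Za-z0-9_-]+\/01_Spec\.md#L[1-9][0-9]*$/;

// True when `ref` is non-empty and not an absolute POSIX or Windows path.
function isRepoRelativeRef(ref: string): boolean {
  if (ref.length === 0) return false;
  return !/^([\\/]|[A-Za-z]:[\\/])/.test(ref);
}

// Hypothetical, flattened view of the fullHarness fields named in the README.
interface FullHarnessState {
  status: "in-progress" | "completed";
  finalDecision: string;
  reviewerSignoffStatus: string;
  terminationReason?: string;
}

// Return every rule the given state violates (empty array means consistent).
function fullHarnessViolations(s: FullHarnessState): string[] {
  const problems: string[] = [];
  const hasReason =
    s.terminationReason !== undefined && s.terminationReason !== "";
  if (s.status === "in-progress") {
    if (s.finalDecision !== "pending") {
      problems.push("in-progress requires finalDecision=pending");
    }
    if (s.reviewerSignoffStatus !== "pending") {
      problems.push("in-progress requires reviewerSignoff.status=pending");
    }
    if (hasReason) {
      problems.push("in-progress must not set terminationReason");
    }
  } else {
    if (!hasReason) {
      problems.push("completed requires terminationReason");
    }
    if (s.finalDecision === "pending") {
      problems.push("completed requires a non-pending finalDecision");
    }
    if (s.reviewerSignoffStatus === "pending") {
      problems.push("completed requires a terminal reviewerSignoff");
    }
  }
  return problems;
}
```

With these definitions, .qfai/specs/spec-0001/01_Spec.md#L3 matches SPEC_DECLARED_REF, while a #route-home fragment or a notes.md path does not.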

ATDD annotation hard gate

qfai validate enforces spec-to-test traceability with directory-based rules.

tests/e2e/**: annotate all covered user stories with concrete IDs such as QFAI:SPEC-0001:US-0001.
tests/integration/**: annotate all covered test cases with concrete IDs such as QFAI:SPEC-0001:TC-0001.
tests/api/**: annotate all covered API contracts with concrete IDs such as QFAI:CON-API-0001.
tests/api/** and tests/e2e/** must not use TC annotations.
AC annotations are not required in code; AC coverage is treated as indirect through full TC coverage.
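The directory-based rules above can be sketched as a small checker. This is an illustration under stated assumptions, not the logic qfai validate actually runs: IDs are assumed to use four-digit numbers as in the examples, paths are assumed to be repo-relative with forward slashes, and "annotate all covered items" is simplified to "at least one annotation of the expected kind is present".

```typescript
type AnnotationKind = "US" | "TC" | "CON-API" | "other";

// Classify an annotation ID using the concrete forms shown in the README.
function annotationKind(id: string): AnnotationKind {
  if (/^QFAI:SPEC-\d{4}:US-\d{4}$/.test(id)) return "US";
  if (/^QFAI:SPEC-\d{4}:TC-\d{4}$/.test(id)) return "TC";
  if (/^QFAI:CON-API-\d{4}$/.test(id)) return "CON-API";
  return "other";
}

// Return rule violations for one test file, given the annotation IDs found in it.
function checkTestAnnotations(filePath: string, ids: string[]): string[] {
  const problems: string[] = [];
  const kinds = ids.map(annotationKind);
  const has = (k: AnnotationKind): boolean => kinds.indexOf(k) !== -1;
  const inDir = (dir: string): boolean => filePath.indexOf(dir) === 0;

  if (inDir("tests/e2e/")) {
    if (!has("US")) problems.push("tests/e2e/** needs a US annotation");
    if (has("TC")) problems.push("tests/e2e/** must not use TC annotations");
  } else if (inDir("tests/api/")) {
    if (!has("CON-API")) problems.push("tests/api/** needs a CON-API annotation");
    if (has("TC")) problems.push("tests/api/** must not use TC annotations");
  } else if (inDir("tests/integration/")) {
    if (!has("TC")) problems.push("tests/integration/** needs a TC annotation");
  }
  return problems;
}
```

For example, a file under tests/api/ that carries both QFAI:CON-API-0001 and QFAI:SPEC-0001:TC-0001 would yield one violation, because TC annotations are reserved for tests/integration/**.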

Operating model (skills-driven workflow)

QFAI assumes you operate the project primarily via prepared custom skills. A custom skill is a reusable task instruction set for your AI coding agent. The agent reads QFAI assets under .qfai/assistant/ and produces or updates SDD/ATDD/TDD artifacts and code.

Where the skills live

QFAI canonical skills (SSOT): .qfai/assistant/skills/** (may be overwritten when you re-run qfai init --force).
QFAI no longer creates local override scaffolds. Project-specific guidance should live in your repository's normal agent docs or be created explicitly by your AI workflow.

Minimal custom skill set

QFAI includes a small set of custom skills (stored under .qfai/assistant/skills/) designed to keep the workflow opinionated and repeatable.

qfai-configure: Analyze the repository (language, frameworks, test layout, directory structure) and tailor qfai.config.yaml accordingly (especially testFileGlobs). Run this once right after npx qfai init, and re-run it when the repository structure changes.
qfai-discussion: Run a unified structured discussion that produces and maintains the latest discussion pack as 15 required markdown files under .qfai/discussion/discussion-<ts>/. UI-bearing discussion packs may include prototyping.yaml as an optional recommendation artifact; non-UI discussion packs typically omit it.
qfai-sdd: Unified SDD entrypoint with discussion-pack preflight guard (missing/incomplete/blocking OQ causes stop + next action guidance). After preflight, the skill runs a mandatory Stage 1 Triage that classifies every incoming requirement into one of 8 first-class operations (CREATE / UPDATE:APPEND / UPDATE:MODIFY / UPDATE:REMOVE / DELETE / SPLIT / MERGE / SUPERSEDE) with an append-first bias: existing active specs absorb the change unless there is zero subject-token overlap. CREATE / DELETE / SPLIT / MERGE / SUPERSEDE / UPDATE:REMOVE require explicit AskUserQuestion approval, and CREATE rows must register a new CAP-NNNN in .qfai/specs/_policies/03_Capabilities.md before the row is accepted (QFAI-TRIAGE-006). Every 01_Spec.md declares a lifecycle Status: active | superseded | deprecated | removed (QFAI-STATUS-001..006).
qfai-prototyping: Single-thread design evolution loop. One prototype iterated through up to 15 cycles of generate -> capture -> review with a 4-axis ordinal rubric, anti-slop detection, prose critique, and explicit pivot permission. Stops deterministically when all four axes hit exceptional (exit 64) or the iteration budget is exhausted (exit 65).
qfai-atdd: Implement acceptance tests driven by specs/scenarios.
qfai-implement: Unified TDD micro-cycle (Red/Green/Refactor) one test at a time using test-list.md as the execution ledger, including ledger status updates and exception closure.
qfai-verify: Run full-scan local quality gates (validate --fail-on error, report, repo gates) and produce reviewer-approved evidence under .qfai/evidence/.
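The Stage 1 Triage approval rule described for qfai-sdd above can be sketched as follows. This is a minimal illustration, not the skill's implementation: the TriageRow shape and the registeredCaps list (standing in for the rows of .qfai/specs/_policies/03_Capabilities.md) are hypothetical.

```typescript
// The 8 first-class triage operations named in the README.
type TriageOp =
  | "CREATE"
  | "UPDATE:APPEND"
  | "UPDATE:MODIFY"
  | "UPDATE:REMOVE"
  | "DELETE"
  | "SPLIT"
  | "MERGE"
  | "SUPERSEDE";

// Operations that need explicit AskUserQuestion approval.
const APPROVAL_REQUIRED: TriageOp[] = [
  "CREATE",
  "DELETE",
  "SPLIT",
  "MERGE",
  "SUPERSEDE",
  "UPDATE:REMOVE",
];

function requiresApproval(op: TriageOp): boolean {
  return APPROVAL_REQUIRED.indexOf(op) !== -1;
}

// Hypothetical row shape for one triaged requirement.
interface TriageRow {
  op: TriageOp;
  approved: boolean;
  capId?: string; // e.g. "CAP-0007"; only meaningful for CREATE
}

// Return the reasons a row cannot be accepted yet (empty means acceptable).
function triageBlockers(row: TriageRow, registeredCaps: string[]): string[] {
  const blockers: string[] = [];
  if (requiresApproval(row.op) && !row.approved) {
    blockers.push(`${row.op} needs explicit user approval`);
  }
  if (row.op === "CREATE") {
    const registered =
      row.capId !== undefined && registeredCaps.indexOf(row.capId) !== -1;
    if (!registered) {
      blockers.push("CREATE needs a registered CAP-NNNN (QFAI-TRIAGE-006)");
    }
  }
  return blockers;
}
```

UPDATE:APPEND and UPDATE:MODIFY pass through without approval, which reflects the append-first bias.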

Workflow sequence (example)

This sequence shows which skill to run, in what order, and what artifacts to expect.

sequenceDiagram
participant U as User
participant AG as AI Agent
participant Q as QFAI Kit (.qfai)
participant R as Repo (codebase)

U->>R: Create a repo (or open an existing one)
U->>R: Run npx qfai init
R-->>U: .qfai kit installed (4-layer assistant tree + skills + agents)

U->>AG: Run /qfai-configure
AG->>Q: Read .qfai/assistant/skills/qfai-configure/SKILL.md
AG->>R: Update qfai.config.yaml (testFileGlobs, etc.)
AG-->>U: Config tuned to this repo

opt If you only have an idea
U->>AG: Run /qfai-discussion
AG-->>U: Structured discussion package (.qfai/discussion/discussion-<ts>/)
end

U->>AG: Run /qfai-sdd
AG->>Q: Read .qfai/assistant/skills/qfai-sdd/SKILL.md
AG->>R: Preflight + create/refine layered specs + finalize 10_Plan + 09_delta
AG-->>U: SDD artifacts ready

U->>AG: Run /qfai-prototyping
AG->>Q: Read .qfai/assistant/skills/qfai-prototyping/SKILL.md
AG->>R: Build contract-aligned implementation skeleton
AG-->>U: Prototype ready

U->>AG: Run /qfai-atdd
AG->>Q: Read .qfai/assistant/skills/qfai-atdd/SKILL.md
AG->>R: Implement acceptance tests
AG-->>U: ATDD tests ready

U->>AG: Run /qfai-implement
AG->>Q: Read .qfai/assistant/skills/qfai-implement/SKILL.md
AG->>R: Execute TDD micro-cycle (Red/Green/Refactor) per test-list.md
AG-->>U: Implementation complete

U->>AG: Run /qfai-verify
AG->>Q: Read .qfai/assistant/skills/qfai-verify/SKILL.md
AG->>R: Run quality gates and summarize evidence
AG-->>U: Verification summary ready

U->>R: Run npx qfai validate
U->>R: Run npx qfai report
R-->>U: Traceability checks and report artifacts

Operational notes.

Each custom skill must output in the user’s language (absolute requirement).
Each custom skill must end with a completion message that enumerates all available next actions and clearly states what to do for each option.
Except qfai-discussion, each skill must analyze the project context (architecture, tech stack, test framework, repo structure) before generating artifacts or code.
Skills should delegate work to multiple role-based sub-agents (Planner, Architect, Contract Designer, QA, Code Reviewer, etc.) to emulate a real delivery flow.
Change classification (Primary/Tags) is required in 09_delta.md and recommended in PRs. See .qfai/assistant/constitution/change-classification.md.
Verification planning is recorded in 09_delta.md (Verification -> Plan) and validated in CI (VFY-* rules).
Review gate policies (required/optional layers and reviewers) are defined in .qfai/assistant/catalog/review-gate.rules.yml.
Agent taxonomy and invocation SSOT are defined in .qfai/assistant/manifest/agent-catalog.yml, .qfai/assistant/manifest/agent-routing.yml, and .qfai/assistant/manifest/review-profiles.yml.

Configuration

Configuration is stored at the repository root as qfai.config.yaml; you can change paths, traceability policies, and CI gate thresholds.

Example: override paths and traceability globs.

paths:
  contractsDir: .qfai/contracts
  specsDir: .qfai/specs
  discussionDir: .qfai/discussion
  outDir: .qfai/report
  skillsDir: .qfai/assistant/skills
  srcDir: src
  testsDir: tests
validation:
  failOn: error # error | warning | never
  traceability:
    testFileGlobs:
      - "src/**/*.test.ts"
      - "tests/**/*.spec.ts"
    testFileExcludeGlobs:
      - "**/fixtures/**"
    scMustHaveTest: true
    scNoTestSeverity: warning # error | warning

Notes.

validate.json, report.json, doctor.json, and run-* JSON logs are internal exports and are not a stable external contract; prefer report.md for integrations that must survive tool upgrades.
Scenario files are expected to use the Gherkin extension *.feature (not *.md).
prototyping.calibration.packPath points to the calibration pack SSOT; runtime and validator both resolve thresholds and iteration parameters from that pack.
prototyping.calibration.thresholds, maxIterations, plateauDelta, and plateauLookback are unsupported public config fields. Put calibration values in the referenced pack instead of qfai.config.yaml.
Observability modules (src/core/observability/) exist as foundation code but are not yet integrated into blocking validation. They are reserved for future operational instrumentation.
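As a rough mental model for the validation.failOn and scNoTestSeverity settings in the configuration example, consider the sketch below. The semantics are assumptions, not documented behavior: it assumes failOn: warning also fails on errors, and that a scenario without a test is reported at scNoTestSeverity only when scMustHaveTest is true.

```typescript
type Severity = "error" | "warning" | "info";
type FailOn = "error" | "warning" | "never";

// Decide whether a run should fail, given the severities of its findings.
// Assumption: "warning" is a threshold, so errors fail under it as well.
function shouldFailGate(severities: Severity[], failOn: FailOn): boolean {
  if (failOn === "never") return false;
  const hasError = severities.indexOf("error") !== -1;
  const hasWarning = severities.indexOf("warning") !== -1;
  return failOn === "error" ? hasError : hasError || hasWarning;
}

// Mirrors the two traceability keys from the example config.
interface TraceabilityPolicy {
  scMustHaveTest: boolean;
  scNoTestSeverity: "error" | "warning";
}

// Severity reported for a scenario with no linked test, or null if not enforced.
function untestedScenarioSeverity(p: TraceabilityPolicy): Severity | null {
  return p.scMustHaveTest ? p.scNoTestSeverity : null;
}
```

Under these assumptions, the example config (failOn: error, scNoTestSeverity: warning) reports untested scenarios without failing the gate.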

Specifications and contracts (SDD)

QFAI uses a small, opinionated set of artifacts to reduce ambiguity and prevent agents from “inventing” behavior.

Requirements: what you want to achieve, constraints, and explicit non-goals.
Specs: structured expected behaviors, inputs/outputs, edge cases, and invariants.
Contracts:
- UI contracts: YAML (.yaml / .yml)
- API contracts: YAML (.yaml / .yml)
- DB contracts: SQL (.sql)
Scenarios (ATDD): Gherkin .feature files

Traceability is validated across these artifacts, so code changes remain grounded in the specs and the tests prove compliance.

SSOT boundaries

flowchart LR
  S[".qfai/specs/** (layered 01..10)"] --> V["qfai validate"]
  C[".qfai/contracts/**"] --> V
  V --> R[".qfai/report/**"]

Specs SSOT: .qfai/specs/** (layered files 01_Spec.md..09_delta.md + shared delta layer)
Contracts SSOT: .qfai/contracts/**
Report outputs (.qfai/report/**) are derived artifacts and not SSOT.

Minimal tutorial

npx qfai init
Run /qfai-discussion to structure scope, open questions, and produce a discussion pack under .qfai/discussion/discussion-<ts>/.
Run /qfai-sdd to build layered specs and finalized plans.
For each completed review cycle, append artifacts under .qfai/review/review-<timestamp>/.
Run npx qfai validate then npx qfai report.

Release gate behavior:

Merge gate: qfai validate must pass (error=0); an open OQ (open question) is reported as a warning.
Release gate: set release_candidate: true in the Initiative layer (03_Initiative.md); open OQ then becomes error.
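The difference between the two gates can be sketched as a pure function. This is a simplified illustration: the function and parameter names are hypothetical, and it only models the open-OQ severity switch described above.

```typescript
type GateSeverity = "warning" | "error";

// Open questions are warnings for merges and errors for release candidates.
function openOqSeverity(releaseCandidate: boolean): GateSeverity {
  return releaseCandidate ? "error" : "warning";
}

// A gate passes when the effective error count is zero.
function gatePasses(
  errorCount: number,
  openOqCount: number,
  releaseCandidate: boolean
): boolean {
  const oqErrors =
    openOqSeverity(releaseCandidate) === "error" ? openOqCount : 0;
  return errorCount + oqErrors === 0;
}
```

For instance, a spec set with no errors and two open questions passes the merge gate but fails once release_candidate: true is set.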

FAQ

Q: I referenced AC/TC directly from upper layers and got an error.
- A: Keep upper-to-lower references out of upper docs; use 16_Traceability-ledger.md for cross-layer linkage.
Q: Ledger validation fails with missing columns.
- A: Ensure required columns exist: trace_id,obj_id,init_id,cap_id,flow_id,us_id,ac_id,ex_ids,tc_ids.
Q: 09_delta.md fails validation.
- A: Include all required sections (Change Summary, Rationale, Candidates Considered, Adopted, Rejected, Impact, Follow-ups) and include both DO NOT and Temptation in Rejected.
Q: release_candidate validation fails due to open questions.
- A: Keep specs definition-only, use .qfai/report/run-* as execution logs, and convert open OQ to resolved or deferred with evidence.
Q: qfai validate reports QFAI-STATUS-001 ("Status bullet が見つかりません", i.e. "Status bullet not found") on every spec.
- A: Each 01_Spec.md must declare - Status: active | superseded | deprecated | removed (introduced in 1.8.8). Add Status: active for currently-authoritative specs; superseded specs need a - Superseded-by: spec-NNNN companion bullet, and deprecated/removed specs need - Deprecated-at: YYYY-MM-DD. The previous QFAI-STATUS-001 (status-leak guard) was renamed to QFAI-STATUSLEAK-001 to free the namespace.
Q: /qfai-sdd is asking for AskUserQuestion approval that earlier versions never asked for.
- A: Stage 1 Triage classifies each requirement into one of 8 first-class operations and gates approval-required ops (CREATE / DELETE / SPLIT / MERGE / SUPERSEDE / UPDATE:REMOVE) on explicit user confirmation. Append-first means UPDATE:APPEND on an existing active spec is the default; CREATE additionally requires a new CAP-NNNN row in .qfai/specs/_policies/03_Capabilities.md before the row is accepted (QFAI-TRIAGE-006).
Q: delta.md validation reports QFAI-TRIAGE-001 ("Change Summary はあるが Triage がありません", i.e. "Change Summary is present but Triage is missing") as a warning.
- A: 1.8.8 introduced a ## Triage section requirement. Existing operational deltas without it currently fail soft (warning); future minor versions will promote this to an error after operational backfill.
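Two of the answers above (required ledger columns and required 09_delta.md sections) amount to a "which required names are missing" check. The sketch below illustrates that idea; it assumes the ledger header is a Markdown table row, which this README does not state, and the helper names are hypothetical.

```typescript
// Required ledger columns, as listed in the FAQ.
const LEDGER_COLUMNS: string[] = [
  "trace_id", "obj_id", "init_id", "cap_id", "flow_id",
  "us_id", "ac_id", "ex_ids", "tc_ids",
];

// Required 09_delta.md sections, as listed in the FAQ.
const DELTA_SECTIONS: string[] = [
  "Change Summary", "Rationale", "Candidates Considered",
  "Adopted", "Rejected", "Impact", "Follow-ups",
];

// Return the required names that do not appear in `present`.
function missingItems(required: string[], present: string[]): string[] {
  return required.filter((name) => present.indexOf(name) === -1);
}

// Split a Markdown table header row such as "| trace_id | obj_id |"
// into trimmed, non-empty cell names.
function parseHeaderRow(row: string): string[] {
  return row
    .split("|")
    .map((cell) => cell.trim())
    .filter((cell) => cell.length > 0);
}
```

A header row that stops at ac_id would report ex_ids and tc_ids as missing, which matches the "missing columns" failure described in the FAQ.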

Continuous integration

QFAI generates integration wrappers under .agents/**, .claude/**, .github/**, and .codex/**. It does not generate GitHub Actions workflows. Configure CI in your own platform and run:

pnpm ci:gate
pnpm check-types:future
# or, minimum gate only:
npx qfai validate --fail-on error

Recommended baseline.

Keep CI on default/full validation (qfai validate --fail-on error or qfai validate --profile verify --fail-on error); do not use partial profiles in CI.
Keep pnpm check-types:future as a separate mandatory gate so future TS compatibility runs once without duplicating pnpm ci:gate.
Add a report step (npx qfai report) when you need a human-readable artifact.
Tune traceability globs in qfai.config.yaml to match your test layout.

Waiver policy.

Use waivers only for warning / info findings (false positives).
Waivers that target error findings are invalid and fail validation (QFAI-WAIVER-002).
Expired waivers are reported as warnings (QFAI-WAIVER-003) and must be renewed or removed with evidence.
Suppressed findings remain visible in reports as suppressed=true; waivers do not erase findings.
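The waiver rules above can be sketched as a function over findings. This is an illustration only: the Finding and Waiver shapes (including targetCode and expiresOn) are hypothetical and do not describe the real waivers.yml schema, and it assumes an expired waiver no longer suppresses anything.

```typescript
type FindingSeverity = "error" | "warning" | "info";

interface Finding {
  code: string;
  severity: FindingSeverity;
  suppressed: boolean;
}

// Hypothetical waiver shape; expiresOn is an ISO date (YYYY-MM-DD).
interface Waiver {
  targetCode: string;
  expiresOn: string;
}

// Apply waivers without erasing findings: matches are marked suppressed,
// invalid or expired waivers add their own findings.
function applyWaivers(
  findings: Finding[],
  waivers: Waiver[],
  today: string
): Finding[] {
  const result: Finding[] = findings.map(
    (f): Finding => ({ code: f.code, severity: f.severity, suppressed: false })
  );
  const extra: Finding[] = [];

  waivers.forEach((w) => {
    // ISO dates compare correctly as plain strings.
    if (w.expiresOn < today) {
      extra.push({ code: "QFAI-WAIVER-003", severity: "warning", suppressed: false });
      return;
    }
    result.forEach((f) => {
      if (f.code !== w.targetCode) return;
      if (f.severity === "error") {
        // Waivers must not target error findings.
        extra.push({ code: "QFAI-WAIVER-002", severity: "error", suppressed: false });
      } else {
        f.suppressed = true;
      }
    });
  });

  return result.concat(extra);
}
```

The returned list always contains every original finding, so reports can still show suppressed entries with suppressed=true.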

Typical customizations.

Add a doctor step before validate if you want to fail fast on path/glob/config issues.
Publish .qfai/report/validate.json, report.md, and relevant .qfai/report/run-*/ logs as CI artifacts.

Generated structure

npx qfai init generates the following structure in your repository.

.
├── .agents
│   ├── README.md
│   └── skills
│       └── qfai-configure
│           └── SKILL.md
├── .qfai
│   ├── assistant
│   │   ├── agents
│   │   │   ├── acceptance-test-engineer.md
│   │   │   ├── architecture-reviewer.md
│   │   │   ├── backend-engineer.md
│   │   │   ├── completion-reviewer.md
│   │   │   ├── delivery-planner.md
│   │   │   ├── devops-ci-engineer.md
│   │   │   ├── discovery-analyst.md
│   │   │   ├── doc-steward.md
│   │   │   ├── frontend-engineer.md
│   │   │   ├── implementation-reviewer.md
│   │   │   ├── orchestrator.md
│   │   │   ├── product-experience-architect.md
│   │   │   ├── product-surface-reviewer.md
│   │   │   ├── qa-gatekeeper.md
│   │   │   ├── qa-strategist.md
│   │   │   ├── requirements-analyst.md
│   │   │   ├── requirements-reviewer.md
│   │   │   ├── solution-architect.md
│   │   │   └── test-design-analyst.md
│   │   ├── constitution
│   │   │   ├── agent-selection.md
│   │   │   ├── change-classification.md
│   │   │   ├── communication.md
│   │   │   ├── constitution.md
│   │   │   ├── drift-protocol.md
│   │   │   ├── quality.md
│   │   │   ├── requirements-decomposition.md
│   │   │   ├── research-first-protocol.md
│   │   │   ├── shared-skill-delegation-baseline.md
│   │   │   ├── shared-skill-operating-baseline.md
│   │   │   ├── thinking.md
│   │   │   └── workflow.md
│   │   ├── manifest
│   │   │   ├── agent-catalog.yml
│   │   │   ├── agent-routing.yml
│   │   │   └── review-profiles.yml
│   │   ├── process
│   │   │   └── migrations
│   │   │       └── v<X.Y.Z>-<topic>.md
│   │   ├── skills
│   │   │   ├── qfai-configure
│   │   │   │   └── SKILL.md
│   │   │   ├── qfai-discussion
│   │   │   │   ├── references
│   │   │   │   │   └── rcp_footer.md
│   │   │   │   └── SKILL.md
│   │   │   ├── qfai-prototyping
│   │   │   │   └── SKILL.md
│   │   │   ├── qfai-sdd
│   │   │   │   ├── references
│   │   │   │   │   └── rcp_footer.md
│   │   │   │   └── SKILL.md
│   │   │   ├── qfai-atdd
│   │   │   │   └── SKILL.md
│   │   │   ├── qfai-implement
│   │   │   │   └── SKILL.md
│   │   │   └── qfai-verify
│   │   │       └── SKILL.md
│   │   └── catalog
│   │       ├── cli-ux-guidelines.md
│   │       ├── manifest.md
│   │       ├── product.md
│   │       ├── review-gate.rules.yml
│   │       ├── spec_required_files.json
│   │       ├── structure.md
│   │       ├── tech.md
│   │       ├── test-layers.md
│   │       └── ui-definition-protocol.md
│   └── waivers.yml
└── qfai.config.yaml

qfai init does not seed .qfai workflow artifacts such as specs, discussions, contracts, evidence, reports, reviews, placeholder spec directories, or artifact README files. Those files are created later by QFAI skills when real work exists.

Integration wrappers are also generated for immediate use:

Agents/Codex VS Code: .agents/skills/**
Claude Code: .claude/skills/**, .claude/agents/**
GitHub Copilot: .github/skills/**, .github/agents/**
Codex: .codex/skills/**

Agent integrations

npx qfai init installs canonical skills under .qfai/assistant/skills/** (SSOT) and generates thin wrapper assets for Agents/Codex VS Code / Copilot / Claude Code / Codex. Canonical agent markdown under .qfai/assistant/agents/** uses a shared YAML frontmatter subset (name, description, tools) compatible with Claude Code and GitHub Copilot, while Codex consumes mirrored .codex/agents/*.toml profiles. If wrapper assets drift from canonical skills, rerun npx qfai init --force to resync.

Contributing (for QFAI maintainers)

This repository is a monorepo, and the distributable package is under packages/qfai. If you change documentation, keep the repository root README and the package README aligned (CI enforces this).

License

MIT
