autonomous-dev

8-Agent SDLC Pipeline for Claude Code

Transform feature implementation into an automated, quality-enforced workflow. One command runs research → plan → test → implement → review → security → docs → git automation.

The Problem

AI coding assistants are powerful, but they drift. They build features you didn't ask for. They ignore your architecture. They forget your constraints.

Without guardrails, you get:

Scope creep - Features that don't align with project goals
23% bug rate - Code that wasn't tested first
12% security vulnerabilities - No automated auditing
67% documentation drift - Docs that don't match code

The Solution: Intent-Aligned Development

autonomous-dev keeps AI aligned to YOUR intent. Every feature validates against your project's strategic goals before a single line of code is written.

PROJECT.md-First Development

Define your intent once. Enforce it automatically.

# .claude/PROJECT.md

## GOALS
- Build a secure authentication system
- Maintain sub-100ms API response times

## SCOPE
- User management, session handling
- NOT: Payment processing, analytics

## CONSTRAINTS
- No external auth providers
- Must support offline mode

## ARCHITECTURE
- REST API with JWT tokens
- PostgreSQL for persistence

Every /implement command validates:

Does this feature align with GOALS?
Is it within SCOPE?
Does it violate CONSTRAINTS?
Does it follow ARCHITECTURE?

Result: The AI builds what you actually want, not what it thinks you want.

The Numbers

Metric	Without Pipeline	With `/implement`
Scope alignment	Manual review	Automatic validation
Bug rate	23%	4%
Security issues	12%	0.3%
Documentation drift	67%	2%
Test coverage	43%	94%

85% of issues caught before commit.

Quick Start

# Install (30 seconds)
bash <(curl -sSL https://raw.githubusercontent.com/akaszubski/autonomous-dev/master/install.sh)

# First install: Restart Claude Code (Cmd+Q / Ctrl+Q)
# Subsequent updates: /reload-plugins (reloads commands, agents, skills)
# Note: /reload-plugins does NOT reload hooks or settings — use full restart for those
/setup  # Guided PROJECT.md creation

How It Works

One Command, Full Pipeline

/implement Add user authentication with JWT tokens

autonomous-dev orchestrates 8 specialized AI agents in sequence:

STEP 1: Alignment     → Validates against PROJECT.md goals
STEP 2: Research      → Finds existing patterns in your codebase
STEP 3: Planning      → Designs the architecture
STEP 4: TDD Tests     → Writes failing tests FIRST
STEP 5: Implementation → Makes tests pass
STEP 6: Parallel Validation (3 agents simultaneously):
        ├── Code Review    → Checks quality and patterns
        ├── Security Audit → Scans for OWASP vulnerabilities
        └── Documentation  → Keeps docs in sync
STEP 7: Git Automation → Commit, push, PR, close issue

Result: Production-ready code in 15-25 minutes.

Key Commands

Command	Purpose
`/setup`	Interactive PROJECT.md creation wizard
`/implement`	Full pipeline: align, research, plan, test, implement, review, secure, document
`/implement --quick`	Fast mode: implementer agent only (2-5 min)
`/implement --batch`	Process multiple features from file
`/implement --issues 1 2 3`	Process features from GitHub issues
`/implement --resume`	Continue interrupted batch
`/align`	Validate alignment (project goals, CLAUDE.md, retrofit)
`/create-issue`	Research-backed GitHub issues with duplicate detection
`/advise`	Critical analysis before major decisions
`/audit-tests`	Identify untested code paths
`/audit-claude`	Validate CLAUDE.md structure against best practices
`/health-check`	Validate all plugin components (agents, hooks, commands)
`/sync`	Update plugin from marketplace
`/worktree`	Manage git worktrees for isolated development
`/improve`	Analyze sessions for drift, test gaps, doc staleness
`/scaffold-genai-uat`	Scaffold LLM-as-judge tests into any repo

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      /implement Pipeline                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   alignment → researcher → planner → test-master → implementer  │
│                                          ↓                       │
│                    ┌─────────────────────┴─────────────────────┐│
│                    │         Parallel Validation               ││
│                    │  reviewer + security-auditor + doc-master ││
│                    │           (60% faster)                    ││
│                    └───────────────────────────────────────────┘│
│                                          ↓                       │
│                              git automation                      │
│                     (commit → push → PR → close issue)           │
│                                                                  │
├─────────────────────────────────────────────────────────────────┤
│  8 Pipeline Agents  │  Skills  │  Hooks  │  Libraries │
└─────────────────────────────────────────────────────────────────┘

Agent Architecture

8 Pipeline Agents (invoked via Task tool):

researcher-local (Haiku) - Searches codebase for existing patterns
researcher (Sonnet) - Researches best practices and security considerations
planner (Sonnet) - Designs implementation architecture
test-master (Sonnet) - Writes comprehensive tests BEFORE implementation (TDD)
implementer (Sonnet) - Writes production-quality code to make tests pass
reviewer (Sonnet) - Reviews code quality, patterns, and coverage
security-auditor (Haiku) - Scans for OWASP vulnerabilities
doc-master (Haiku) - Updates documentation to match code changes

Utility Agents (8 more):

commit-message-generator, data-curator, data-quality-validator, distributed-training-coordinator, issue-creator, quality-validator, sync-validator, test-coverage-auditor

How Agents Work:

Agents are markdown prompts (not Python files)
Invoked via Claude's Task tool with subagent_type parameter
Each agent has specialized knowledge and tools
Pipeline runs sequentially (except parallel validation in step 6)

Model Tier Strategy (Cost-Optimized)

Tier	Model	Agents	Cost Savings
Tier 1	Haiku	researcher-local, reviewer, doc-master, security-auditor	40-60% savings
Tier 2	Sonnet	implementer, test-master, planner, researcher-web	Balanced cost/quality
Tier 3	Opus	(reserved for future complex reasoning tasks)	Deep reasoning when needed

Active Automation Hooks

Hooks run automatically at key moments to enforce quality without manual intervention:

Hook Type	What It Does
PreToolUse	4-layer security validation (sandboxing → MCP → agent auth → batch permissions)
PreCommit	Blocks commits with failing tests, missing docs, or security issues
SubagentStop	Triggers git automation after pipeline completion
PrePromptSubmit	Enforces workflow discipline and command validation
SessionStart	Restores session state after `/clear` for continuity

Key Hooks:

unified_pre_tool.py: 4-layer security (84% reduction in permission prompts)
stop_quality_gate.py: End-of-turn quality checks (pytest, ruff, mypy)
enforce_tdd.py: TDD workflow enforcement (tests before code)
enforce_orchestrator.py: PROJECT.md alignment validation
unified_session_tracker.py: Session state persistence across /clear operations

Hook Exit Code Semantics:

EXIT_SUCCESS (0): Hook passed, continue execution
EXIT_WARNING (1): Non-blocking warning, log and continue
EXIT_BLOCK (2): Critical failure, block operation
Lifecycle hooks (PreToolUse, SubagentStop) must exit 0 (blocking not allowed)

What hooks catch automatically:

Documentation drift from code changes
Secrets accidentally staged for commit
git push --force to protected branches
Missing test coverage on new code
CLAUDE.md out of sync with codebase
Path traversal and injection attacks
Agent execution failures (auto-retry with circuit breaker)

Philosophy: Quality gates should be automatic. If you have to remember to check something, you'll eventually forget.

Zero Manual Git Operations

After pipeline completion, autonomous-dev handles everything:

Feature completes → Generate commit message → Stage & commit
                 → Push to remote → Create PR → Close GitHub issue

Enabled by default with first-run consent. Configure via .env:

AUTO_GIT_ENABLED=true   # Master switch (default: true)
AUTO_GIT_PUSH=true      # Auto-push (default: true)
AUTO_GIT_PR=true        # Auto-create PRs (default: true)

Features:

Conventional commit messages generated by AI
Co-authorship footer included
PR descriptions with summary, test plan, related issues
Automatic GitHub issue closing with workflow summary

Self-Validation Quality Gates

autonomous-dev enforces its own quality standards. When the plugin detects it's running in the autonomous-dev repository, stricter gates activate automatically:

In autonomous-dev Repository:

Coverage Threshold: 80% (vs 70% for user projects)
No Bypass: Cannot override quality gates with --no-verify
Automatic Enforcement: Enabled without configuration needed

Quality Gates Enforced:

Gate	Standard	autonomous-dev
Test Coverage	70%	80%
TDD Requirement	Suggested	Required
Pre-commit Checks	Optional	Mandatory
Documentation Sync	Auto-checked	Strictly enforced

How It Works:

Hooks detect autonomous-dev repository via plugins/autonomous-dev/manifest.json
Auto-detect based on file structure (survives worktrees, CI/CD)
Apply stricter thresholds in enforcement hooks
Block commits that don't meet standards (no bypass possible)

Philosophy: "We practice what we preach." If autonomous-dev requires 80% coverage from you, it requires it from itself.

Batch Processing

Process 50+ features without manual intervention:

From File

# features.txt
Add JWT authentication
Add password reset (requires JWT)
Add email notifications

/implement --batch features.txt

From GitHub Issues

/implement --issues 72 73 74

Auto-Continuation (Issue #285)

New: Batch now automatically continues through all features in a single invocation.

/implement --batch features.txt
# Processes Features 1/5 → 2/5 → 3/5 → 4/5 → 5/5 automatically
# No manual `/implement --resume` needed between features
# Manual resume only needed if batch is interrupted (not between features)

Why This Matters: Previously, batch processing stopped after each feature requiring manual intervention. Now features auto-continue without interruption. Failed features are recorded but don't stop the batch.

Smart Features

Dependency Analysis: Automatically reorders features based on dependencies (Issue #157)

Original: [tests, auth, email]
Optimized: [auth, email, tests]  # Tests run after implementation

Analyzes feature dependencies via keyword matching
Topological sort for optimal execution order
Circular dependency detection with ASCII graph visualization
Preserves explicit ordering when dependencies are ambiguous

Checkpoint/Resume: Automatic session snapshots with safe resume capability (Issues #276, #277)

# Batch processing automatically creates checkpoints after each feature
# If interrupted, resume from checkpoint:
/implement --resume batch-20260110-143022

# Rollback to previous checkpoint if needed:
/implement --rollback batch-20260110-143022 --previous

Context Management:

Claude auto-compact at ~185K tokens (23% increase from 150K baseline)
Batch checkpoints after every feature (state + progress saved)
SessionStart hook auto-resumes after /clear or auto-compact
Corrupted checkpoint recovery with .bak fallback
Token tracking: context_token_delta per feature for threshold detection

Automatic Retry: Transient failures (network, rate limits) retry automatically. Permanent errors (syntax, type) skip immediately.

Per-Feature Git: Each feature commits separately with conventional messages.

Issue Auto-Close: GitHub issues closed automatically after push with summary comment.

GitHub Actions Integration

Automate PR reviews and issue implementation with Claude directly in your GitHub workflow.

Workflow	Trigger	What It Does
Claude Code Review	PR opened/updated, `@claude` comment	Automated code review against project conventions
Claude Issue Implementation	Issue labeled `claude-implement`	Reads issue, implements solution, opens PR

See docs/GITHUB-ACTIONS.md for setup and configuration.

Advanced Features

Ralph Loop: Self-Correcting Agent Execution

Agents automatically retry failed operations with intelligent validation:

Agent fails → Analyze error → Adjust strategy → Retry (up to 3x)
           ↓
     Circuit breaker prevents infinite loops

5 Validation Strategies:

pytest: Test-based validation (tests must pass)
safe_word: AI confirms completion in response
file_existence: Required files must exist
regex: Output must match pattern
json: Output must be valid JSON

Features:

Token limits prevent runaway costs
Circuit breaker after 3 failures
Validation strategy per agent type
Default: ENABLED for all pipeline agents

Quality Persistence: Honest Batch Summaries

Batch processing enforces 100% integrity:

Rules:

100% test pass requirement (not 80%)
Failed features keep GitHub issues OPEN + 'blocked' label
Retry escalation: 3 attempts with increasing wait times
Honest summaries: Never fakes success

Why This Matters: Prevents "silent failures" where batch reports success but issues remain broken.

Worktree Isolation

Per-batch worktree creation for parallel development:

/implement --batch features.txt
# Creates: .worktrees/batch-20260128-143022/
# Automatic CWD change to worktree
# Main repo remains untouched until merge

Benefits:

Concurrent CI jobs without interference
Per-worktree batch state isolation
Safe experimentation (discard without affecting main)
Supports nested worktrees

Session State Persistence

Session state survives /clear operations:

Stored in .claude/local/SESSION_STATE.json:

Active tasks and next steps
Key conventions (repo-specific patterns)
Recent context (files modified, workflows completed)

Read at SessionStart: Automatic continuity across sessions

UV Script Execution

All hooks use UV for reproducible execution:

Features:

PEP 723 metadata blocks (inline dependencies)
Zero environment setup overhead
Graceful fallback to sys.path
Single-file portability

Example (from any hook):

#!/usr/bin/env -S uv run --quiet
# /// script
# dependencies = ["pytest", "coverage"]
# ///

Security-First Architecture

4-Layer Permission Architecture

Layer 1: Sandbox Enforcer      → Command classification (SAFE/BLOCKED/NEEDS_APPROVAL)
Layer 2: MCP Security          → Path traversal, injection, SSRF prevention
Layer 3: Agent Authorization   → Pipeline agent detection
Layer 4: Batch Approver        → Caches user consent for identical operations

Result: 84% reduction in permission prompts (50+ → 8-10).

Security Validations

CWE-22: Path traversal prevention
CWE-78: Command injection blocking
CWE-918: SSRF prevention
Symlink rejection outside whitelist
Credential safety (never logged)
Audit logging to logs/security_audit.log

Performance

Optimization Results

Phase	Improvement
Model optimization (Haiku for research)	3-5 min saved
Prompt simplification	2-4 min saved
Parallel validation	60% faster (5 min → 2 min)
Smart agent selection	95% faster for typos/docs

Current Performance:

Full pipeline: 15-25 minutes per feature
Quick mode: 2-5 minutes
Typo fixes: <2 minutes (95% faster than full pipeline)

Context Management

Automatic context summarization (Claude Code handles 200K token budget)
Session files log to docs/sessions/ (200 tokens vs 5,000+)
/clear recommended after each feature for optimal performance

Documentation

Guide	Description
CLAUDE.md	Project instructions and quick reference
Architecture	Technical architecture deep-dive
Agents	8-agent pipeline + utility agents
Hooks	Active automation hooks reference
Skills	Skills and agent integration
Workflow Discipline	Why pipelines beat direct implementation
Performance	Benchmarks and optimization history
Git Automation	Zero manual git operations
Batch Processing	Multi-feature workflows with auto-continuation
Security	Security model and hardening guide

Requirements

Claude Code 2.0+
macOS or Linux
Git
gh CLI (optional, for PR creation and issue management)

Philosophy

Automation > Reminders > Hope

Quality should be automatic, not optional. autonomous-dev makes the right thing the easy thing:

Research happens automatically before implementation
Tests are written first (TDD enforced)
Security is scanned on every feature
Documentation stays in sync automatically
Git operations are orchestrated end-to-end
Alignment is validated against your stated intent

You describe what you want. The pipeline handles the rest.

The 4-Layer Consistency Architecture

Layer	Weight	Purpose
Hooks	10%	Deterministic blocking (secrets, force push)
CLAUDE.md	30%	Persuasion via data (this README)
Convenience	40%	Quality path is the easy path
Skills	20%	Agent expertise and knowledge

We don't force quality. We make it the path of least resistance.

License

MIT

<p align="center"> <strong>Built for developers who ship.</strong> </p>

Name	skill-integration-templates
Description	Standardized templates and patterns for integrating skills into agent prompts. Reduces token overhead through reusable skill reference syntax, action verbs, and progressive disclosure usage guidelines.

skill-integration-templates

SKILL.md

autonomous-dev

The Problem

The Solution: Intent-Aligned Development

PROJECT.md-First Development

The Numbers

Quick Start

How It Works

One Command, Full Pipeline

Key Commands

Architecture

Agent Architecture

Model Tier Strategy (Cost-Optimized)

Active Automation Hooks

Zero Manual Git Operations

Self-Validation Quality Gates

Batch Processing

From File

From GitHub Issues

Auto-Continuation (Issue #285)

Smart Features

GitHub Actions Integration

Advanced Features

Ralph Loop: Self-Correcting Agent Execution

Quality Persistence: Honest Batch Summaries

Worktree Isolation

Session State Persistence

UV Script Execution

Security-First Architecture

4-Layer Permission Architecture

Security Validations

Performance

Optimization Results

Context Management

Documentation

Requirements

Philosophy

Automation > Reminders > Hope

The 4-Layer Consistency Architecture

License